{ "info": { "author": "Chengsheng Zhu, Maximilian Miller", "author_email": "mmiller@bromberglab.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Science/Research", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "# mi-faser #\n### microbiome - functional annotation of sequencing reads ###\n\nA super-fast ( < 20min/10GB of reads ) and accurate ( > 90% precision ) method for annotation of\nmolecular functionality encoded in sequencing read data without the need for assembly or gene finding.\n\n**Web Service:** https://bromberglab.org/services/mifaser/\n\n**Docker:** A pre-build docker image is available at https://hub.docker.com/r/bromberglab/mifaser\n\n## Pre-Requirements ##\n\n*mi-faser* runs on **LINUX**, **MacOSX** and **WINDOWS** systems.\n\n**Dependencies**\n\n* Python >= 3.6\n* DIAMOND >= 0.8.8 (included; sources: https://github.com/bbuchfink/diamond)\n* WINDOWS: Visual C++ Redistributable *\n\n**Note:** *mi-faser* was developed and optimized using **DIAMOND v0.8.8**, which is included in all release up to v1.11.4. This is also the version used in the accompanying publication [1]. All newer releases of *mi-faser* use the latest stable release of *DIAMOND*. *mi-faser* results for the first release (v1.2) with an updated version of *DIAMOND* (v0.9.13) were not affected by this (<0.1% difference; based on results for the artificial metagenome supplied as example dataset). According to the authors, more recent versions of DIAMOND offer substantial improvements regarding speed and memory usage as well as bugfixes. Thus, we strongly recommend to always use the latest version of DIAMOND (see Section: *DIAMOND upgrade*). This might alter *mi-faser* results slightly. However, results are expected to be enriched by new correct annotations rather than introducing mis-annotations.\n\nNote that it is recommended to download and **compile DIAMOND locally** (https://github.com/bbuchfink/diamond) as this might have a\nsignificant impact on performance (due to special CPU instructions).\nHowever, this repository includes a pre-compiled version of DIAMOND to use.\n\nNote that different split sizes could, at very rare occasions, result in minor deviations in *mi-faser* annotations. This is due to certain heuristics applied by DIAMOND when generating sequence alignments. We suggest to retain the split size for comparable analyses.\n\n**Optional extensions**\n\n* SRA Toolkit >= 2.9.1 ([NCBI](https://www.ncbi.nlm.nih.gov/sra/docs/toolkitsoft/))\n\n If installed enables *mi-faser* to automatically retrieve and process read files deposited in the NCBI Sequence Read Archives [SRA](https://www.ncbi.nlm.nih.gov/sra). Currently SRR, ERR and DRR identifiers are suppotted.\n\n## Reference Database ##\n\n*mi-faser* was developed using a manually curated reference database of protein functions (*GS* database; [DOI 10.5281/zenodo.1048269](https://doi.org/10.5281/zenodo.1048268)).\n\nSince version 1.5 *mi-faser* also contains a new *GS+* database, which extends the default *GS* database. The *GS+* database includes additional 55 manually curated protein sequences, introducing 28 new E.C.s that represent important microbial functions in the environment.\n\nTo create an new reference database, refer to the paragraph *Creating a reference database*.\n\n## Installation ##\n\n**Standalone *VS* Web Service**\n\nThe Standalone version of *mi-faser* partitions the user input into subsets analogue to the Web Service (http://services.bromberglab.org/mifaser/). However, those partitions are processed sequentially and not in parallel as in the Web Service.\nThus the Standalone Version is only recommended for smaller jobs and is mainly thought to provide the *mi-faser* code base.\n\n**Python package**\n*mi-faser* is available as python package. To install *mi-faser* using pip run:\n```\npip install mifaser\n```\n*mi-faser* can the be used directly from the command line:\n```\nmifaser\n```\nThe *mi-faser* module can be imported in a Python project by `import mifaser`.\n\n**Docker**\n\nThe pre-build *mi-faser* docker image is probably the most convenient way to run *mi-faser* locally or in any cloud infrastructure. The docker image can be used in the same way as the standalone version, however mounting of a common working directory into the virtual environment is required.\n\nTo create and execute a single instance of *mi-faser* using a locally mounted working directory run:\n```\ndocker run --rm \\\n -v :/input \\\n -v :/output \\\n bromberglab/mifaser -f \n```\n is a valid *mi-faser* input file located in on your host environment. By default, *mi-faser* reads inputfiles relative to `/input` and writes any output to `/output`. Thus, by bind mounting your local to `/input` inside the docker container, input files can be passed simply as relative paths to your . Similarly, by mounting a to `/output` inside the docker container, all *mi-faser* outputs can be accessed at the .\n\n**Python source (git repository)**\n\nOpen a terminal and checkout the *mi-faser* repository:\n```\ngit clone https://git@bitbucket.org/bromberglab/mifaser.git\n```\nor download the zipped version:\n```\ncurl --remote-name https://bitbucket.org/bromberglab/mifaser/get/master.zip\nunzip master.zip\n```\n\n## Usage ##\n\n**In case *mi-faser* was downloaded using the git repository:**\n\n * navigate to the *mi-faser* repository base directory\n * all examples in the following documentation have to be run using `python -m mifaser` instead of `mifaser`.\n\n **run *mi-faser* (Single or 2-Lane mode)**\n\n**Single:** input-file containing DNA reads, single http[s]/ftp[s] url or SRA accession ID (sra:):\n```\nmifaser -f/--inputfile \n```\n\n**2-Lane:** two files (R1/R2), http[s]/ftp[s] urls or SRA accession IDs (sra: sra:):\n```\nmifaser -l/--lanes \n```\n\n
\n\n## CLI ##\n*mi-faser* help:\n```\nusage: mifaser [-h] [-f INPUTFILE] [-l R1 R2] [-o OUTPUTFOLDER]\n [-d DATABASEFOLDER] [-i DIAMONDFOLDER] [-m] [-s SPLIT]\n [-S [SPLITMB]] [-t THREADS] [-c CPU] [-p] [-n] [-u UPDATE]\n [-D [arg [arg ...]]] [-v] [-q] [--version]\n\nmi-faser, microbiome - functional annotation of sequencing reads\n\na super-fast ( < 10min/10GB of reads ) and accurate ( > 90% precision ) method\nfor annotation of molecular functionality encoded in sequencing read data\nwithout the need for assembly or gene finding.\n\nPublic web service: https://services.bromberglab.org/mifaser\n\nVersion: 1.60 [03/23/20]\n\noptional arguments:\n -h, --help show this help message and exit\n -f INPUTFILE, --inputfile INPUTFILE\n input DNA reads file, http[s]/ftp[s] url or SRA\n accession id (sra:)\n -l R1 R2, --lanes R1 R2\n 2-Lane format (R1/R2) files, http[s]/ftp[s] url or SRA\n accession ids (sra: sra:)\n -o OUTPUTFOLDER, --outputfolder OUTPUTFOLDER\n path to base output folder; default: INPUTFILE_out\n -d DATABASEFOLDER, --databasefolder DATABASEFOLDER\n name of database located in database/ directory OR\n absolute path to folder containing database files\n -i DIAMONDFOLDER, --diamondfolder DIAMONDFOLDER\n path to folder containing diamond binary\n -m, --mapping if flag is set all reads mappings will be generated\n (reads{n=*} -> EC{n=1}, fasta)\n -s SPLIT, --split SPLIT\n split by X sequences; default: 100k; 0 forces no split\n -S [SPLITMB], --splitmb [SPLITMB]\n split by X MB; default: 25; (requires split from GNU\n Coreutils)\n -t THREADS, --threads THREADS\n number of threads; default: 1\n -c CPU, --cpu CPU max cpus per thread; default: all available\n -p, --preserve if flag is set intermediate results are kept\n -n, --no-check if flag is set check for compatibility between diamond\n database and binary is omitted\n -u UPDATE, --update UPDATE\n valid update commands: { diamond[:version] }\n -D [arg [arg ...]], --createdb [arg [arg ...]]\n create new reference database: \n [merge_db=] [update_ec_annotations=<1|0>; default: 0]\n -v, --verbose set verbosity level; default: log level INFO\n -q, --quiet if flag is set console output is logged to file\n --version show program's version number and exit\n\nIf you use *mi-faser* in published research, please cite:\n\nZhu, C., Miller, M., ... Bromberg, Y. (2017).\nFunctional sequencing read annotation for high precision microbiome analysis.\nNucleic Acids Res. [doi:10.1093/nar/gkx1209]\n(https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkx1209/4670955)\n\nmi-faser is developed by Chengsheng Zhu and Maximilian Miller.\nFeel free to contact us for support at services@bromberglab.org.\n\nThis project is licensed under [NPOSL-3.0](http://opensource.org/licenses/NPOSL-3.0)\n\nTest: mifaser -f mifaser/files/test/artificial_mg.fasta -o mifaser/files/test/out\n```\n\n**Example**\n\nA demo dataset containing 10k reads is provided to verify a local *mi-faser* installation. Navigate to the *mifaser* repository base directory and run *mi-faser* with the following arguments:\n```\nmifaser -f mifaser/files/test/artificial_mg.fasta -o mifaser/files/test/out\n```\nThe resulting analysis will be located relative to the *mifaser* base directory at: *mifaser/files/test/out/*.\n\n**DIAMOND upgrade**\n\nAs DIAMOND (https://github.com/bbuchfink/diamond) is actively developed, we provide an easy way to upgrade (or downgrade) to another version.\nIn case a specific version of DIAMOND is given as parameter, this version will be automatically downloaded and installed (default: latest release).\n```\nmifaser --update diamond[:]\n```\n\n**Creating a reference database**\n\n*mi-faser* uses a manually curated reference database of protein functions. To create an alternative reference database, first store the desired set of protein sequences in a multi-FASTA file using the following format for the sequence headers:\n>\\>id|annotation|e.c.-number|additional_details\n\nsequences.fasta:\n```\n>id|annotation|e.c.-number|additional_details\nMKPNTDFMLIADGAKVLTQGNLTEHCAIEVSDGIICGLKSTISAEWTADKPHYRLTSGTL\nVAGFIDTQVNGGGGLMFNHVPTLETLRLMMQAHRQFGTTAMLPTVITDDIEVMQAAADAV\nAEAIDCQVPGIIGIHFEG\n>id|annotation|e.c.-number|additional_details\nMYYGLDIGGTKIELAIFDTQLALQDKWRLSTPGQDYSAFMATLAEQIEKADQQCGERGTV\nGIALPGVVKADGTVISSNVPCLNQRRVAHDLAQLLNRTVAIGNDCRCFALSEAVLGVGRG\nYSRVLGMI\n```\n\nThen run *mi-faser* using the *-D/--createdb* argument to create a new reference database *my_database*:\n\n```\nmifaser -D my_database path/to/sequences.fasta\n```\n\nTo use the new database run:\n\n```\nmifaser -d my_database -f mifaser/files/test/artificial_mg.fasta -o mifaser/files/test/out\n```\n\nSee the *help* menu (--help) for more details.\n\n
\n\n## License ##\n\nThis project is licensed under [NPOSL-3.0](http://opensource.org/licenses/NPOSL-3.0).\n\n## Citation ##\n\nIf you use *mi-faser* in published research, please cite:\n\nZhu, C., Miller, M., Marpaka, S., Vaysberg, P., R\u00fchlemann, M. C., Wu, G. H. F.-A., . . . Bromberg, Y. *(2017)*. Functional sequencing read annotation for high precision microbiome analysis. Nucleic Acids Res. [doi:10.1093/nar/gkx1209](https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkx1209/4670955)\n\n\n## About ##\n\n*mi-faser* is developed by Chengsheng Zhu and Maximilian Miller. Feel free to contact us for support: [services@bromberglab.org](mailto:services@bromberglab.org).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/bromberglab/mifaser", "keywords": "microbiome,metagenome,function annotation", "license": "NPOSL-3.0", "maintainer": "", "maintainer_email": "", "name": "mifaser", "package_url": "https://pypi.org/project/mifaser/", "platform": "", "project_url": "https://pypi.org/project/mifaser/", "project_urls": { "Bug Tracker": "https://bitbucket.org/bromberglab/mifaser/issues", "Documentation": "https://bitbucket.org/bromberglab/mifaser/wiki/docs", "Homepage": "https://bitbucket.org/bromberglab/mifaser", "Source Code": "https://bitbucket.org/bromberglab/mifaser" }, "release_url": "https://pypi.org/project/mifaser/1.60/", "requires_dist": null, "requires_python": ">=3.6", "summary": "a python package for super-fast and accurate annotation of molecular functionality using read data without prior assembly or gene finding", "version": "1.60", "yanked": false, "yanked_reason": null }, "last_serial": 6866699, "releases": { "1.58": [ { "comment_text": "", "digests": { "md5": "96f61adc0a229b2ec9233c69dcd1fe72", "sha256": "5dea327c21a69763160869640dcca62a29ef5c68f68bc751711652d7fcaed6be" }, "downloads": -1, "filename": "mifaser-1.58-py3-none-any.whl", "has_sig": false, "md5_digest": "96f61adc0a229b2ec9233c69dcd1fe72", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 4691214, "upload_time": "2019-10-25T22:21:36", "upload_time_iso_8601": "2019-10-25T22:21:36.084490Z", "url": "https://files.pythonhosted.org/packages/32/94/20cbfe3606cf54172d68df9e73c0648734b413f5a3729867ac86db6367fa/mifaser-1.58-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "8353aee0af84d5c3d7e9c6510f371461", "sha256": "9136bc4d32bda7b0cb7ac621243af1210d73d7af401424768e04b726db049ba3" }, "downloads": -1, "filename": "mifaser-1.58.tar.gz", "has_sig": false, "md5_digest": "8353aee0af84d5c3d7e9c6510f371461", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 4680769, "upload_time": "2019-10-25T22:21:44", "upload_time_iso_8601": "2019-10-25T22:21:44.506709Z", "url": "https://files.pythonhosted.org/packages/ea/b2/3d9d5bded6361d347ad8b085bf082204a7b7cd737926c0fdb8de26de765e/mifaser-1.58.tar.gz", "yanked": false, "yanked_reason": null } ], "1.59": [ { "comment_text": "", "digests": { "md5": "eb2f4a15e191dc7b916cade0d303af11", "sha256": "33e4350d9b825b676d1b65a25fbeb50da6116e5919a4d4a722be0a4bdbf48f6a" }, "downloads": -1, "filename": "mifaser-1.59-py3-none-any.whl", "has_sig": false, "md5_digest": "eb2f4a15e191dc7b916cade0d303af11", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 4691119, "upload_time": "2020-02-20T19:06:00", "upload_time_iso_8601": "2020-02-20T19:06:00.468492Z", "url": "https://files.pythonhosted.org/packages/10/a7/ea903a7f3c47f0d356dae17ebd504a0eb15c85501f1fdf65df11ed5998c2/mifaser-1.59-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "2541cfbf0092b98223977a8ce40ef56b", "sha256": "4fda3bee45048729f247f9e05d11d3998d607c4b09ae3986270d9924769066a8" }, "downloads": -1, "filename": "mifaser-1.59.tar.gz", "has_sig": false, "md5_digest": "2541cfbf0092b98223977a8ce40ef56b", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 4680076, "upload_time": "2020-02-20T19:06:03", "upload_time_iso_8601": "2020-02-20T19:06:03.243781Z", "url": "https://files.pythonhosted.org/packages/45/59/d136dc7f74e3f49f8a87f5b48c2a126f2f91d7e61d3ced954bb3c938471a/mifaser-1.59.tar.gz", "yanked": false, "yanked_reason": null } ], "1.60": [ { "comment_text": "", "digests": { "md5": "49b3b250b674e950c71d0bc8dbba2743", "sha256": "d56115d7f24daeaa2c8e3c75514099957cade3223e821f1b0bb0b52bb21a9405" }, "downloads": -1, "filename": "mifaser-1.60-py3-none-any.whl", "has_sig": false, "md5_digest": "49b3b250b674e950c71d0bc8dbba2743", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 4691179, "upload_time": "2020-03-23T16:32:33", "upload_time_iso_8601": "2020-03-23T16:32:33.876944Z", "url": "https://files.pythonhosted.org/packages/d1/5f/d228ac6081b7c48848b5061f9994238b543eeb18b687cccc9abe3db3589f/mifaser-1.60-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "5ad8d97c52ecb1a459650b004b20a30c", "sha256": "24a2db0c5b733d63ae9266857132aee8912e8fe0c8c1e11ae86727f53b386de3" }, "downloads": -1, "filename": "mifaser-1.60.tar.gz", "has_sig": false, "md5_digest": "5ad8d97c52ecb1a459650b004b20a30c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 4680160, "upload_time": "2020-03-23T16:32:39", "upload_time_iso_8601": "2020-03-23T16:32:39.356493Z", "url": "https://files.pythonhosted.org/packages/18/ad/ae2ab2177810bf512de4696bd909e1fc6cf81f783b5ea27481e67e44a3ba/mifaser-1.60.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "49b3b250b674e950c71d0bc8dbba2743", "sha256": "d56115d7f24daeaa2c8e3c75514099957cade3223e821f1b0bb0b52bb21a9405" }, "downloads": -1, "filename": "mifaser-1.60-py3-none-any.whl", "has_sig": false, "md5_digest": "49b3b250b674e950c71d0bc8dbba2743", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 4691179, "upload_time": "2020-03-23T16:32:33", "upload_time_iso_8601": "2020-03-23T16:32:33.876944Z", "url": "https://files.pythonhosted.org/packages/d1/5f/d228ac6081b7c48848b5061f9994238b543eeb18b687cccc9abe3db3589f/mifaser-1.60-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "5ad8d97c52ecb1a459650b004b20a30c", "sha256": "24a2db0c5b733d63ae9266857132aee8912e8fe0c8c1e11ae86727f53b386de3" }, "downloads": -1, "filename": "mifaser-1.60.tar.gz", "has_sig": false, "md5_digest": "5ad8d97c52ecb1a459650b004b20a30c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 4680160, "upload_time": "2020-03-23T16:32:39", "upload_time_iso_8601": "2020-03-23T16:32:39.356493Z", "url": "https://files.pythonhosted.org/packages/18/ad/ae2ab2177810bf512de4696bd909e1fc6cf81f783b5ea27481e67e44a3ba/mifaser-1.60.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }