{
    "info": {
        "author": "Peter Kruczkiewicz",
        "author_email": "peter.kruczkiewicz@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 2 - Pre-Alpha",
            "Intended Audience :: Developers",
            "License :: OSI Approved :: Apache Software License",
            "Natural Language :: English",
            "Programming Language :: Python :: 3",
            "Programming Language :: Python :: 3.6",
            "Programming Language :: Python :: 3.7",
            "Topic :: Scientific/Engineering",
            "Topic :: Scientific/Engineering :: Bio-Informatics"
        ],
        "description": "=======================\nfilter_classified_reads\n=======================\n\n\n.. image:: https://img.shields.io/pypi/v/filter_classified_reads.svg\n        :target: https://pypi.python.org/pypi/filter_classified_reads\n\n.. image:: https://travis-ci.org/peterk87/filter_classified_reads.svg?branch=master\n    :target: https://travis-ci.org/peterk87/filter_classified_reads\n\n.. image:: https://readthedocs.org/projects/filter-classified-reads/badge/?version=latest\n        :target: https://filter-classified-reads.readthedocs.io/en/latest/?badge=latest\n        :alt: Documentation Status\n\n\n\n\nFilter for reads from taxa of interest using Kraken2/Centrifuge classification results.\n\n\n* Free software: Apache Software License 2.0\n* Documentation: https://filter-classified-reads.readthedocs.io.\n\n\nFeatures\n--------\n\n* Filter for the union of reads classified to taxa of interest by Kraken2_ and Centrifuge_ (by default, viral reads, taxid=10239)\n* Output unclassified reads along with reads from taxa of interest *or* exclude them with ``--exclude-unclassified``\n* Uses seqtk_ to filter reads quickly and pbgzip_ for parallel block Gzip compression of output reads (installing these dependencies with Conda_ is recommended)\n\nUsage\n-----\n\nPaired-end reads with classification results from both Kraken2_ and Centrifuge_:\n\n.. code-block::\n\n    filter_classified_reads -i /path/to/reads/R1.fq \\\n                            -I /path/to/reads/R2.fq \\\n                            -o  /path/to/reads/R1.filtered.fq.gz \\\n                            -O  /path/to/reads/R2.filtered.fq.gz \\\n                            -k  /path/to/kraken2/results.tsv \\\n                            -K  /path/to/kraken2/kreport.tsv \\\n                            -c  /path/to/centrifuge/results.tsv \\\n                            -C  /path/to/centrifuge/kreport.tsv\n\n\nUsing test data in ``tests/data/``:\n\n.. 
code-block::\n\n    $ filter_classified_reads -i tests/data/SRR8207674_1.viral_unclassified.seqtk_seed42_n10000.fastq.gz \\\n                              -I tests/data/SRR8207674_2.viral_unclassified.seqtk_seed42_n10000.fastq.gz \\\n                              -o r1.fq.gz \\\n                              -O r2.fq.gz \\\n                              -k tests/data/SRR8207674-kraken2_results.tsv \\\n                              -K tests/data/SRR8207674-kraken2_report.tsv \\\n                              -c tests/data/SRR8207674-centrifuge_results.tsv \\\n                              -C tests/data/SRR8207674-centrifuge_kreport.tsv\n\nYou should see the following log information:\n\n.. code-block::\n\n    2019-04-16 13:40:34,114 INFO: Parsing centrifuge results into DataFrame [in target_classified_reads.py:49]\n    2019-04-16 13:40:34,168 INFO: Parsed n=12281 centrifuge result records into DataFrame from \"tests/data/SRR8207674-centrifuge_results.tsv\" [in target_classified_reads.py:57]\n    2019-04-16 13:40:34,172 INFO: Parsed n=298 centrifuge Kraken-style report records into DataFrame from \"tests/data/SRR8207674-centrifuge_kreport.tsv\" [in target_classified_reads.py:60]\n    2019-04-16 13:40:34,177 INFO: Found 7129 unclassified reads from Centrifuge results [in target_classified_reads.py:65]\n    2019-04-16 13:40:34,242 INFO: Found 231 unique viral Taxonomy IDs [in target_classified_reads.py:98]\n    2019-04-16 13:40:34,245 INFO: Found 2181 target reads from centrifuge results [in target_classified_reads.py:101]\n    2019-04-16 13:40:34,245 INFO: Parsing kraken2 results into DataFrame [in target_classified_reads.py:49]\n    2019-04-16 13:40:34,289 INFO: Parsed n=20000 kraken2 result records into DataFrame from \"tests/data/SRR8207674-kraken2_results.tsv\" [in target_classified_reads.py:57]\n    2019-04-16 13:40:34,293 INFO: Parsed n=139 kraken2 Kraken-style report records into DataFrame from \"tests/data/SRR8207674-kraken2_report.tsv\" [in 
target_classified_reads.py:60]\n    2019-04-16 13:40:34,295 INFO: Found 1737 unclassified reads from Centrifuge results [in target_classified_reads.py:65]\n    2019-04-16 13:40:34,325 INFO: Found 26 unique viral Taxonomy IDs [in target_classified_reads.py:98]\n    2019-04-16 13:40:34,331 INFO: Found 8345 target reads from kraken2 results [in target_classified_reads.py:101]\n    2019-04-16 13:40:34,332 INFO: Found N=1701 common unclassified reads by all classification methods. [in cli.py:110]\n    2019-04-16 13:40:34,333 INFO: Total viral reads=8357 [in util.py:37]\n    2019-04-16 13:40:34,333 INFO: Centrifuge found n=12 target reads not found with Kraken2 [in util.py:38]\n    2019-04-16 13:40:34,333 INFO: Kraken2 found n=6176 target reads not found with Centrifuge [in util.py:40]\n    2019-04-16 13:40:34,338 INFO: N=1701 reads unclassified by both Centrifuge and Kraken2. [in util.py:62]\n    2019-04-16 13:40:34,345 INFO: Writing n=9999 filtered reads from \"tests/data/SRR8207674_1.viral_unclassified.seqtk_seed42_n10000.fastq.gz\" to \"r1.fq.gz\" [in cli.py:129]\n    2019-04-16 13:40:34,957 INFO: Writing n=9999 filtered reads from \"tests/data/SRR8207674_2.viral_unclassified.seqtk_seed42_n10000.fastq.gz\" to \"r2.fq.gz\" [in cli.py:134]\n    2019-04-16 13:40:35,459 INFO: Done! [in cli.py:137]\n\n\n\nCredits\n-------\n\nThis package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.\n\n.. _Cookiecutter: https://github.com/audreyr/cookiecutter\n.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage\n.. _Kraken2: https://ccb.jhu.edu/software/kraken2/\n.. _Centrifuge: https://ccb.jhu.edu/software/centrifuge/manual.shtml\n.. _seqtk: https://github.com/lh3/seqtk\n.. _pbgzip: https://anaconda.org/bioconda/pbgzip\n.. 
_Conda: https://conda.io/en/latest/\n\n\n=======\nHistory\n=======\n\n0.2.0 (2019-09-23)\n------------------\n\n* Use ``seqtk subseq`` instead of screed for pulling reads of interest from files\n* Added external dependencies for ``seqtk`` for pulling reads from input FASTQs and ``pbgzip`` for parallel block Gzip for output of Gzipped FASTQs\n* Removed ``screed`` Python dependency\n\n\n0.1.0 (2019-04-15)\n------------------\n\n* First release on PyPI.",
        "description_content_type": "",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/peterk87/filter_classified_reads",
        "keywords": "filter_classified_reads",
        "license": "Apache Software License 2.0",
        "maintainer": "",
        "maintainer_email": "",
        "name": "filter-classified-reads",
        "package_url": "https://pypi.org/project/filter-classified-reads/",
        "platform": "",
        "project_url": "https://pypi.org/project/filter-classified-reads/",
        "project_urls": {
            "Homepage": "https://github.com/peterk87/filter_classified_reads"
        },
        "release_url": "https://pypi.org/project/filter-classified-reads/0.2.0/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Filter for reads from taxa of interest using Kraken2/Centrifuge classification results",
        "version": "0.2.0"
    },
    "last_serial": 5890715,
    "releases": {
        "0.1.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "2701bb6e102a0c0ca71e0207d6acba4d",
                    "sha256": "99db66d8db04547ce97a57e34dfa9b8657730ad9902546e2bb8a56991c406631"
                },
                "downloads": -1,
                "filename": "filter_classified_reads-0.1.0.tar.gz",
                "has_sig": false,
                "md5_digest": "2701bb6e102a0c0ca71e0207d6acba4d",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 1616818,
                "upload_time": "2019-04-17T16:11:45",
                "url": "https://files.pythonhosted.org/packages/95/77/02c5752eaa39b02ed2e06d0d9775c7c83956a062f7c6859e50bb17c10451/filter_classified_reads-0.1.0.tar.gz"
            }
        ],
        "0.2.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "874906de011be0dc8a34cda7ae89df12",
                    "sha256": "34d386ac5ac8d53c24dad7e16056762f2ea6090505ea42d8e81a9e2af203519c"
                },
                "downloads": -1,
                "filename": "filter_classified_reads-0.2.0.tar.gz",
                "has_sig": false,
                "md5_digest": "874906de011be0dc8a34cda7ae89df12",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 1614289,
                "upload_time": "2019-09-26T13:47:23",
                "url": "https://files.pythonhosted.org/packages/4f/ff/d708158d9a8d07c0b7fb0481e7d17643570a8836b2807ca4efe6639c1c8b/filter_classified_reads-0.2.0.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "874906de011be0dc8a34cda7ae89df12",
                "sha256": "34d386ac5ac8d53c24dad7e16056762f2ea6090505ea42d8e81a9e2af203519c"
            },
            "downloads": -1,
            "filename": "filter_classified_reads-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "874906de011be0dc8a34cda7ae89df12",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 1614289,
            "upload_time": "2019-09-26T13:47:23",
            "url": "https://files.pythonhosted.org/packages/4f/ff/d708158d9a8d07c0b7fb0481e7d17643570a8836b2807ca4efe6639c1c8b/filter_classified_reads-0.2.0.tar.gz"
        }
    ]
}