{ "info": { "author": "University of Michigan Bioinformatics Core", "author_email": "bfx-katana@umich.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: Apache Software License", "Operating System :: Unix", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "======\nKatana\n======\n\nCommand-line tool to soft-clip reads from amplicon-based sequencing based on\nspecified primer locations.\n\n.. image:: https://travis-ci.org/umich-brcf-bioinf/Katana.svg?branch=develop\n :target: https://travis-ci.org/umich-brcf-bioinf/Katana\n :alt: Build Status\n\n.. image:: https://coveralls.io/repos/github/umich-brcf-bioinf/Katana/badge.svg?branch=develop\n :target: https://coveralls.io/github/umich-brcf-bioinf/Katana?branch=develop\n :alt: Coverage Status\n\nThe official repository is at:\n\nhttps://github.com/umich-brcf-bioinf/Katana\n\n--------\nOverview\n--------\n\nIn amplicon-based target panel sequencing, regions-of-interest are amplified by\nspecific pairs of primers; consequently the regions-of-interest typically\nstart and end with these primer sequences, sequences which match the\nreference sequence exactly and do not reflect the actual sample sequence. In\nsome panel designs, the amplicons may be tiled such that an amplicon of one\nregion of interest may overlap the primer region of a different amplicon. In this\narrangement, the overlapping regions should enable detection of variants that\nfall within that primer region. However, the presence of the primer sequences\nwill typically overwhelm the signature of true, low-frequency variants.\n\n\nKatana matches each read to its corresponding primer pair based on the start\nposition of the read. 
Katana then soft-clips the primer region from the edge of\nthe read sequence, rescuing the signal of true variants measured by overlapping\namplicons. The output is conceptually similar to hard-clipping the primers from\nthe original FASTQ reads based on sequence identity but with the advantage that\nretaining the primers during alignment improves alignment quality.\n::\n amplicon A [ primerREGION-OF-INTERESTprimer ]\n amplicon B [ primerREGION-OF-INTERESTprimer ]\n input read1 sequence: TGCATGAGTCTGATCTAGGTAGTTGACGTC\n input read2 sequence: ATCTAGGTAGTTGACGTCAGATAATGCAGC\n\n output read1 sequence: tgcatgAGTCTGATCTAGGTAGTTgacgtc (clipped amplicon A primers)\n output read2 sequence: atctagGTAGTTGACGTCAGATAAtgcagc (clipped amplicon B primers)\n (lowercase = soft-clipped)\n\n\nTags are added to each output read to help explain how it was modified:\n - X0 : associated primer id\n - X1 : original cigar string\n - X2 : original reference start\n - X3 : original reference_end (informational; useful for reverse reads)\n - X4 : why read would be excluded (appears only if --preserve_all_alignments)\n\n\nKatana assumes that:\n - input bam is indexed\n - primers come in sense-antisense pairs\n - primer pairs are on the same chromosome\n - primer chromosomes match the bam regions\n - primer file is tab separated; the header line includes the following fields:\n * Customer TargetID\n * Chr\n * Sense Start\n * Antisense Start\n * Sense Sequence\n * Antisense Sequence\n - primer file sense and antisense start are specified in 1-based coordinates\n\n\n-----------\nQuick Start\n-----------\n\n1. **Install Katana (see INSTALL.rst):**\n::\n $ pip install katana\n\n2. **Get the examples directory:**\n::\n $ git clone https://github.com/umich-brcf-bioinf/Katana\n\n3. 
**Run Katana:**\n::\n $ katana Katana/examples/primers.txt Katana/examples/chr10.pten.bam clipped.bam\n\nThis will read chr10.pten.bam and produce clipped.bam which contains reads\nadjusted to soft-clip (exclude) their respective primer regions. Unmapped reads\nor reads which do not match a known primer are excluded.\n\n\n-----------\nKatana help\n-----------\n\n::\n\n $ katana --help\n\n usage: katana primer_manifest input_bam output_bam\n\n Match each alignment in input BAM to primer, softclipping the primer region.\n\n positional arguments:\n primer_manifest path to primer manifest (tab-separated text)\n input_bam path to input BAM\n output_bam path to output BAM\n\n\n optional arguments:\n -h, --help show this help message and exit\n -V, --version show program's version number and exit\n --preserve_all_alignments\n Preserve all incoming alignments (even if they are \n unmapped, cannot be matched with primers, result in \n invalid CIGARs, etc.)\n\n====\n\nEmail bfx-katana@umich.edu for support and questions.\n\nUM BRCF Bioinformatics Core\n\n\nChangelog\n=========\n\n0.1.2 (11/2/2017)\n-----------------\n - Adds/correctly updates MC tag\n - Fixed erroneous mate info when mate is filtered out\n\n - Correctly sets mate start pos to 0\n - Removes MC tag if present\n\n - Sanitizes BAM tag of primer names\n - Extended supported pysam versions to include 0.9-0.12 \n\n\n0.1.1 (2/9/2016)\n----------------\n - Fixed problems in BAM output:\n - Corrected next reference in paired reads\n - Excludes reads where CIGAR is entirely clipped\n - Unpairs reads which had no mate in input\n - Added BAM tags to excluded reads (useful when --preserving_all_reads)\n - Adjusted to improve performance (about 6x faster)\n - Added support for pip install\n - Added functional tests\n - Added support for travis CI\n - Added support for Python3\n - Added support for pysam 0.8.3\n\n0.1 (1/28/2016)\n---------------\n - Initial Release\n\n\nKatana is written and maintained by the University of 
Michigan \nBRCF Bioinformatic Core; individual contributors include:\n\n- Chris Gates\n- Peter Ulintz\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/umich-brcf-bioinf/Katana", "keywords": "bioinformatic exome-seq DNA-seq BAM", "license": "Apache", "maintainer": "", "maintainer_email": "", "name": "Katana", "package_url": "https://pypi.org/project/Katana/", "platform": "", "project_url": "https://pypi.org/project/Katana/", "project_urls": { "Homepage": "https://github.com/umich-brcf-bioinf/Katana" }, "release_url": "https://pypi.org/project/Katana/0.1.2/", "requires_dist": [ "natsort", "pysam" ], "requires_python": "", "summary": "Command-line tool to soft-clip reads based on primer locations.", "version": "0.1.2" }, "last_serial": 3300851, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "210b82d9b30ebc1cee3e9b7cbb51c447", "sha256": "3ce224b489a2d034146b051299b692d0daa5354bca994fb8baa30e80c142b06f" }, "downloads": -1, "filename": "Katana-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "210b82d9b30ebc1cee3e9b7cbb51c447", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 16957, "upload_time": "2016-02-10T21:09:32", "url": "https://files.pythonhosted.org/packages/f3/dc/5d1632d9c8ba91213f98b02c6a57dd71e9cfafe16a93cb622c3113861314/Katana-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "149d77e02a86c86ef3a822209c3e564d", "sha256": "1db3b0e7828312eec2154aaa627184e0183a639510c00998ffdfdd26978d2166" }, "downloads": -1, "filename": "Katana-0.1.1.tar.gz", "has_sig": false, "md5_digest": "149d77e02a86c86ef3a822209c3e564d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12019, "upload_time": "2016-02-10T02:29:45", "url": 
"https://files.pythonhosted.org/packages/35/c2/4ca4c60a013d74a07957dc0a35d326594065618c39f833a3e7ad1352dc9a/Katana-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "4bbcce624231e862d4f02a3da4c3aa59", "sha256": "eb71679c4a937a8be35f71332bb2634e41dcd22077cc3201557d9a6b49ba1515" }, "downloads": -1, "filename": "Katana-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "4bbcce624231e862d4f02a3da4c3aa59", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 18055, "upload_time": "2017-11-02T17:18:58", "url": "https://files.pythonhosted.org/packages/b6/1e/f28db023c35b15cd7b18ad6df90ae06994c333acf53800497324683b479b/Katana-0.1.2-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4bbcce624231e862d4f02a3da4c3aa59", "sha256": "eb71679c4a937a8be35f71332bb2634e41dcd22077cc3201557d9a6b49ba1515" }, "downloads": -1, "filename": "Katana-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "4bbcce624231e862d4f02a3da4c3aa59", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 18055, "upload_time": "2017-11-02T17:18:58", "url": "https://files.pythonhosted.org/packages/b6/1e/f28db023c35b15cd7b18ad6df90ae06994c333acf53800497324683b479b/Katana-0.1.2-py2.py3-none-any.whl" } ] }