{ "info": { "author": "Colby Chiang (colbychiang@wustl.edu)", "author_email": "colbychiang@wustl.edu", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2.7", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "SVTyper\n=======\n[![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/hall-lab/svtyper/master/LICENSE)\n[![Build Status](https://travis-ci.org/hall-lab/svtyper.svg?branch=master)](https://travis-ci.org/hall-lab/svtyper)\n\nBayesian genotyper for structural variants\n\n## Overview\n\nSVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data. Users must supply a VCF file of sites to genotype (which may be generated by [LUMPY](https://github.com/arq5x/lumpy-sv)) as well as a BAM/CRAM file of Illumina paired-end reads aligned with [BWA-MEM](https://github.com/lh3/bwa). SVTyper assesses discordant and concordant reads from paired-end and split-read alignments to infer genotypes at each site. Algorithm details and benchmarking are described in [Chiang et al., 2015](http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html).\n\n![NA12878 heterozygous deletion](etc/het.png?raw=true \"NA12878 heterozygous deletion\")\n\n## Installation\n\nRequirements:\n- Python 2.7.x\n\n### Install via `pip`\n\n    pip install git+https://github.com/hall-lab/svtyper.git\n\n`svtyper` depends on [pysam][0] _(version 0.15.0 or newer)_, [numpy][1], and [scipy][2]; `svtyper-sso` additionally depends on [cytoolz][7]. If the dependencies aren't already available on your system, `pip` will attempt to download and install them.\n\n## `svtyper` vs `svtyper-sso`\n\n`svtyper` is the original implementation of the genotyping algorithm and works with multiple samples. 
`svtyper-sso` is an alternative, parallelized implementation of `svtyper` that is optimized for genotyping a single sample and takes advantage of multiple CPU cores via the [multiprocessing][8] module. It can offer a 2x or greater speedup (depending on how many CPU cores are used) when genotyping a single sample. **_NOTE: svtyper-sso is not yet stable. There are minor logging differences between the two, and svtyper-sso may exit prematurely with an error when processing CRAM files._**\n\n## Example Usage\n\n### `svtyper`\n\n#### As a Command Line Python Script\n\n```bash\nsvtyper \\\n    -i sv.vcf \\\n    -B sample.bam \\\n    -l sample.bam.json \\\n    > sv.gt.vcf\n```\n\n#### As a Python Library\n\n```python\nimport svtyper.classic as svt\n\ninput_vcf = \"/path/to/input.vcf\"\ninput_bam = \"/path/to/input.bam\"\nlibrary_info = \"/path/to/library_info.json\"\noutput_vcf = \"/path/to/output.vcf\"\n\nwith open(input_vcf, \"r\") as inf, open(output_vcf, \"w\") as outf:\n    svt.sv_genotype(bam_string=input_bam,\n                    vcf_in=inf,\n                    vcf_out=outf,\n                    min_aligned=20,\n                    split_weight=1,\n                    disc_weight=1,\n                    num_samp=1000000,\n                    lib_info_path=library_info,\n                    debug=False,\n                    alignment_outpath=None,\n                    ref_fasta=None,\n                    sum_quals=False,\n                    max_reads=None)\n\n# Results will be inside the /path/to/output.vcf file\n```\n\n### `svtyper-sso`\n\n#### As a Command Line Python Script\n\n```bash\n# --core: number of CPU cores to use\n# --batch_size: number of SVs to process in a single batch (default: 1000)\n# --max_reads: skip genotyping if the SV has more valid reads than this threshold (default: 1000)\nsvtyper-sso \\\n    --core 2 \\\n    --batch_size 1000 \\\n    --max_reads 1000 \\\n    -i sv.vcf \\\n    -B sample.bam \\\n    -l sample.bam.json \\\n    > sv.gt.vcf\n```\n\n#### As a Python Library\n\n```python\nimport svtyper.singlesample as sso\n\ninput_vcf = \"/path/to/input.vcf\"\ninput_bam = \"/path/to/input.bam\"\nlibrary_info = \"/path/to/library_info.json\"\noutput_vcf = \"/path/to/output.vcf\"\n\nwith 
open(input_vcf, \"r\") as inf, open(output_vcf, \"w\") as outf:\n    sso.sso_genotype(bam_string=input_bam,\n                     vcf_in=inf,\n                     vcf_out=outf,\n                     min_aligned=20,\n                     split_weight=1,\n                     disc_weight=1,\n                     num_samp=1000000,\n                     lib_info_path=library_info,\n                     debug=False,\n                     alignment_outpath=None,\n                     ref_fasta=None,\n                     sum_quals=False,\n                     max_reads=1000,\n                     cores=2,\n                     batch_size=1000)\n\n# Results will be inside the /path/to/output.vcf file\n```\n\n## Development\n\nRequirements:\n- Python 2.7 or newer\n- GNU Make\n- [virtualenv][3] _(or [conda][4] for [anaconda][5] or [miniconda][6] users)_\n\n### Setting Up a Development Environment\n\n#### Using `virtualenv`\n\n    git clone https://github.com/hall-lab/svtyper.git\n    cd svtyper\n    virtualenv myvenv\n    source myvenv/bin/activate\n    pip install -e .\n\n    make test\n\n    # when you're finished with development\n    git push\n    deactivate\n    cd .. && rm -rf svtyper\n\n#### Using `conda`\n\n    git clone https://github.com/hall-lab/svtyper.git\n    cd svtyper\n    conda create --channel bioconda --name mycenv pysam numpy scipy cytoolz # type 'y' when prompted with \"proceed ([y]/n)?\"\n    source activate mycenv\n    pip install -e .\n\n    make test\n\n    # when you're finished with development\n    git push\n    source deactivate\n    cd .. && rm -rf svtyper\n    conda remove --name mycenv --all\n\n## Troubleshooting\n\nMany common issues are related to abnormal insert size distributions in the BAM file. SVTyper provides methods to assess and visualize the characteristics of sequencing libraries.\n\nRunning SVTyper with the `-l` flag creates a JSON file with essential metrics on a BAM file. SVTyper will sample the first N reads from the file (1 million by default) to parse the libraries, read groups, and insert size histograms. 
This can be done in the absence of a VCF file.\n```\nsvtyper \\\n -B my.bam \\\n -l my.bam.json\n```\n\nThe [lib_stats.R](scripts/lib_stats.R) script produces insert size histograms from the JSON file\n```\nscripts/lib_stats.R my.bam.json my.bam.json.pdf\n```\n![Insert size histogram](etc/my.bam.json.png?raw=true \"Insert size histogram\")\n\n\n## Citation\n\nC Chiang, R M Layer, G G Faust, M R Lindberg, D B Rose, E P Garrison, G T Marth, A R Quinlan, and I M Hall. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Meth 12, 966\u2013968 (2015). doi:10.1038/nmeth.3505.\n\nhttp://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html\n\n[0]: https://github.com/pysam-developers/pysam\n[1]: http://www.numpy.org/\n[2]: https://www.scipy.org/\n[3]: https://github.com/pypa/virtualenv\n[4]: https://conda.io/docs/index.html\n[5]: https://docs.continuum.io/anaconda/\n[6]: https://conda.io/miniconda.html\n[7]: https://github.com/pytoolz/cytoolz\n[8]: https://docs.python.org/2/library/multiprocessing.html\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hall-lab/svtyper", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "svtyper", "package_url": "https://pypi.org/project/svtyper/", "platform": "", "project_url": "https://pypi.org/project/svtyper/", "project_urls": { "Homepage": "https://github.com/hall-lab/svtyper" }, "release_url": "https://pypi.org/project/svtyper/0.7.1/", "requires_dist": [ "pysam (>=0.15.0)", "numpy", "scipy", "cytoolz (>=0.8.2)" ], "requires_python": "", "summary": "Bayesian genotyper for structural variants", "version": "0.7.1" }, "last_serial": 5805576, "releases": { "0.5.0": [ { "comment_text": "", "digests": { "md5": "266de6ae6576adc0886e1fac8d215ae8", "sha256": "6914e7ef146d32c3a7aef0edccb164d58a2c45753f8f619de118fb5793449b87" }, 
"downloads": -1, "filename": "svtyper-0.5.0-py2-none-any.whl", "has_sig": false, "md5_digest": "266de6ae6576adc0886e1fac8d215ae8", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 62766, "upload_time": "2018-01-19T22:15:25", "url": "https://files.pythonhosted.org/packages/5e/03/69ccdb6042f2527848e8c48b196da2b78f2f1dfc4d4dbe31d29910e172a8/svtyper-0.5.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4d8e42bf66b4f74b06d7329c2e29c978", "sha256": "2c9b8b16747f4cadc2edd517530b366e5b642c1fc025a64d958a377f85171f92" }, "downloads": -1, "filename": "svtyper-0.5.0.tar.gz", "has_sig": false, "md5_digest": "4d8e42bf66b4f74b06d7329c2e29c978", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 46045, "upload_time": "2018-01-19T22:15:26", "url": "https://files.pythonhosted.org/packages/1e/4c/ae913a1b85e01566e735a2f83630d81beea5a2c8dedd0c148890bf91f5c6/svtyper-0.5.0.tar.gz" } ], "0.5.2": [ { "comment_text": "", "digests": { "md5": "6112f452d424dd32e4c4632c5b0ad0e7", "sha256": "c8d1e5565bf5a8a23ea1abe4619616581767259fb79e8450a50cc32d0a8968dd" }, "downloads": -1, "filename": "svtyper-0.5.2-py2-none-any.whl", "has_sig": false, "md5_digest": "6112f452d424dd32e4c4632c5b0ad0e7", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 62899, "upload_time": "2018-02-28T04:28:30", "url": "https://files.pythonhosted.org/packages/e1/41/c8222e7bf06c008be81c01cc047cf3ecabca6e1332690bd0907108dac7a4/svtyper-0.5.2-py2-none-any.whl" } ], "0.7.0": [ { "comment_text": "", "digests": { "md5": "e4e5314b5728a0e76e5a22ff914f10d5", "sha256": "1a6cc85af4e0c3d5032370bd251034e4db648edb523d594d0fa64b0dfe35e5f8" }, "downloads": -1, "filename": "svtyper-0.7.0-py2-none-any.whl", "has_sig": false, "md5_digest": "e4e5314b5728a0e76e5a22ff914f10d5", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 59306, "upload_time": "2018-09-13T15:40:15", "url": 
"https://files.pythonhosted.org/packages/9b/b3/5eb0bb61abbfcac1bd18816f841756995b9d578ace7d4071cad28f5c7e24/svtyper-0.7.0-py2-none-any.whl" } ], "0.7.1": [ { "comment_text": "", "digests": { "md5": "655bcd7480d89facb8a1521d74ca8247", "sha256": "49507598a7fbf42f080de7630485aac8f9375c5b06e9a46782e65f8a44990c93" }, "downloads": -1, "filename": "svtyper-0.7.1-py2-none-any.whl", "has_sig": false, "md5_digest": "655bcd7480d89facb8a1521d74ca8247", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 59453, "upload_time": "2019-09-04T20:32:19", "url": "https://files.pythonhosted.org/packages/b4/07/ececf3b0dbd47c423cd5059e364405146560a93fdd228f8a1feee58ab347/svtyper-0.7.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9141a384284cdc466fa034b1378f75b9", "sha256": "d0aa38eb3005210c51d3457c6f30ec4847e39dfcf9df237b464e1f9260a13ebb" }, "downloads": -1, "filename": "svtyper-0.7.1.tar.gz", "has_sig": false, "md5_digest": "9141a384284cdc466fa034b1378f75b9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 46974, "upload_time": "2019-09-04T20:32:21", "url": "https://files.pythonhosted.org/packages/7b/c0/35d8b6162abdf0439247481195a8eba4a3c759f5798d7a017934548823e1/svtyper-0.7.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "655bcd7480d89facb8a1521d74ca8247", "sha256": "49507598a7fbf42f080de7630485aac8f9375c5b06e9a46782e65f8a44990c93" }, "downloads": -1, "filename": "svtyper-0.7.1-py2-none-any.whl", "has_sig": false, "md5_digest": "655bcd7480d89facb8a1521d74ca8247", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 59453, "upload_time": "2019-09-04T20:32:19", "url": "https://files.pythonhosted.org/packages/b4/07/ececf3b0dbd47c423cd5059e364405146560a93fdd228f8a1feee58ab347/svtyper-0.7.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9141a384284cdc466fa034b1378f75b9", "sha256": 
"d0aa38eb3005210c51d3457c6f30ec4847e39dfcf9df237b464e1f9260a13ebb" }, "downloads": -1, "filename": "svtyper-0.7.1.tar.gz", "has_sig": false, "md5_digest": "9141a384284cdc466fa034b1378f75b9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 46974, "upload_time": "2019-09-04T20:32:21", "url": "https://files.pythonhosted.org/packages/7b/c0/35d8b6162abdf0439247481195a8eba4a3c759f5798d7a017934548823e1/svtyper-0.7.1.tar.gz" } ] }