{ "info": { "author": "Vlad Saveliev and Alla Mikheenko", "author_email": "vladislav.sav@gmail.com", "bugtrack_url": null, "classifiers": [ "Environment :: Console", "Environment :: Web Environment", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Natural Language :: English", "Operating System :: MacOS :: MacOS X", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: JavaScript", "Programming Language :: Python", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "[![Anaconda-Server Badge](https://anaconda.org/vladsaveliev/targqc/badges/installer/conda.svg)](https://conda.anaconda.org/vladsaveliev)\n[![Build Status](https://travis-ci.org/vladsaveliev/TargQC.svg?branch=master)](https://travis-ci.org/vladsaveliev/TargQC)\n\n# TargQC - target capture coverage QC tool\n\n## Input\n- BAM file(s) (or FastQ files).\n- BED file (optional).\n\n## Output\n- `summary.html` \u2013 sample-level coverage statistics and plots.\n- `summary.tsv` \u2013 sample-level coverage, parsable version\n- `regions.txt` \u2013 region-level coverage statistics.\n\n## Installation\n### From bioconda\n```\nconda install -c bioconda targqc\n```\n\n### From PyPI\n```\npip install targqc\n```\n\n### From source\n```\ngit clone --recursive https://github.com/vladsaveliev/TargQC\ncd TargQC\nvirtualenv venv_targqc && source venv_targqc/bin/activate # optional, but recommended if you are not an admin\npip install --upgrade pip setuptools\npip install -r requirements.txt\npython setup.py install\n```\n\n## Usage\n```\ntargqc *.bam --bed target.bed -g hg19 -o targqc_results\n```\nThe results will be written to `targqc_results` folder.\n\nThe BED file may be omitted. In this case statistics reported will be based of off the whole genome.\n\nThe accepted values for `-g` are `hg19`, `hg38`, or a full path to any indexed reference fasta file:\n```\ntargqc *.bam --bed target.bed -g /path/to/genomes/some_genome.fa -o targqc_results\n```\nWhen running from BAMs, only the `.fai` index is used, and the fasta file itself can be non-existent.\n\nInstead of the BAM files, input FastQ are also allowed. The reads will be aligned by BWA to the reference \ngenome specified by `--bwa-prefix` (unless `-g` is already a fasta path bwa-indexed).\n```\ntargqc *.fastq --bed target.bed -g hg19 -o targqc_results --bwa-prefix /path/to/ref.bwa\n```\nOption `--downsample-to ` (default value `5e5`) specifies the number of \nread pairs will be randomly selected from each input set. This feature allows to quickly estimate approximate \ncoverage quality before full alignment. To turn downsampling off and align all reads, set `--downsample-to off`.\n\n\n## Parallel running\n### Threads\nRun using 3 threads:\n```\ntargqc *.bam --bed target.bed -g hg19 -o targqc_results -t 3\n```\n### Cluster\nRun using 3 jobs, using SGE scheduler, and queue \"queue\":\n```\ntargqc *.bam --bed target.bed -g hg19 -o targqc_results -t 3 -s sge -q batch.q -r pename=smp\n```\nIf the number of samples is higher than the requested number of jobs, the processes within job will be additionally parallelized using threads, so the full number of occupied cores will equal the number of requested threads (-t)\n\nOther supported schedulers: Platform LSF (\"lsf\"), Sun Grid Engine (\"sge\"), Torque (\"torque\"), SLURM (\"slurm\") (see details at https://github.com/roryk/ipython-cluster-helper)\n\n\n# BED file annotation\n\nThe `bed_annotation` package provides tools for annotation of BED file regions with gene symbols, based on Ensembl data.\n\n### Usage\n```\nannotate_bed.py INPUT.bed -g hg19 -o OUTPUT.bed\n``` \n\nScript checks each region against the Ensembl genomic features database, and writes a BED file in a standardized format with a gene symbol, strand and exon rank in 4-6th columns:\n\n`INPUT.bed`:\n```\nchr1 69090 70008\nchr1 367658 368597\n```\n\n`OUTPUT.bed`:\n```\nchr1 69090 70008 OR4F5 1 +\nchr1 367658 368597 OR4F29 1 +\n```\n\n#### Transcripts order\n\nThe piority for choosing transcripts for annotation is the following:\n- Overlap % with transcript\n- Overlap % with CDS\n- Overlap % with exons\n- Biotype (`protein_coding` > others > `*RNA` > `*_decay` > `sense_*` > `antisense` > `translated_*` > `transcribed_*`)\n- TSL (1 > NA > others > 2 > 3 > 4 > 5)\n- Presence of a HUGO gene symbol\n- Is cancer canonical\n- Transcript size\n\n#### Extended annotation\n\nUse `--extended` option to report extra columns with details on features, biotype, overlapping transcripts and overlap sizes:\n```\nannotate_bed.py INPUT.bed -g hg19 -o OUTPUT.bed --extended\n```\n\n`OUTPUT.bed`:\n```\n## Tx_overlap_%: part of region overlapping with transcripts\n## Exon_overlaps_%: part of region overlapping with exons\n## CDS_overlaps_%: part of region overlapping with protein coding regions\n#Chrom Start End Gene Exon Strand Feature Biotype Ensembl_ID TSL HUGO Tx_overlap_% Exon_overlaps_% CDS_overlaps_%\nchr1 69090 70008 OR4F5 1 + capture protein_coding ENST00000335137 NA OR4F5 100.0 100.0 99.7\nchr1 367658 368597 OR4F29 1 + capture protein_coding ENST00000426406 NA OR4F29 100.0 100.0 99.7\n```\n\n#### Ambuguous annotations\n\nRegions may overlap mltiple genes. The `--ambiguities` controls how the script resolves such ambiguities\n- `--ambiguities all` -- report all reliable overlaps (in order in the \"priority\" section, see above)\n- `--ambiguities all_ask` -- stop execution and ask user which annotation to pick\n- `--ambiguities best_all` (default) -- find the best overlap, and if there are several equally good, report all (in terms of the \"priority\" above)\n- `--ambiguities best_ask` -- find the best overlap, and if there are several equally good, ask user\n- `--ambiguities best_one` -- find the best overlap, and if there are several equally good, report any of them\n\nNote that the first 4 options might output multiple lines per region, e.g.:\n```\nannotate_bed.py INPUT.bed -g hg19 -o OUTPUT.bed --extended --ambiguities best_all\n```\n`OUTPUT.bed`:\n```\n## Tx_overlap_%: part of region overlapping with transcripts\n## Exon_overlaps_%: part of region overlapping with exons\n## CDS_overlaps_%: part of region overlapping with protein coding regions\n#Chrom Start End Gene Exon Strand Feature Biotype Ensembl_ID TSL HUGO Tx_overlap_% Exon_overlaps_% CDS_overlaps_%\nchr1 69090 70008 OR4F5 1 + capture protein_coding ENST00000335137 NA OR4F5 100.0 100.0 100.0\nchr1 367658 368597 OR4F29 1 + capture protein_coding ENST00000426406 NA OR4F29 100.0 100.0 100.0\nchr1 367658 368597 OR4F29 1 + capture protein_coding ENST00000412321 NA OR4F29 100.0 100.0 100.0\n```\n\n# Venn diagrams for BED files\nBuild a web-page with size-proportional Venn diagrams for an unlimited set of BED files:\n```\nbed_venn.py *.bed -o res_dir\n```\n", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/vladsaveliev/TargQC/releases", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/vladsaveliev/TargQC", "keywords": "bioinformatics", "license": "GPLv3", "maintainer": "", "maintainer_email": "", "name": "targqc", "package_url": "https://pypi.org/project/targqc/", "platform": "", "project_url": "https://pypi.org/project/targqc/", "project_urls": { "Download": "https://github.com/vladsaveliev/TargQC/releases", "Homepage": "https://github.com/vladsaveliev/TargQC" }, "release_url": "https://pypi.org/project/targqc/1.5.2/", "requires_dist": null, "requires_python": "", "summary": "Genome capture target coverage evaluation tool", "version": "1.5.2" }, "last_serial": 2907629, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "8c3491876f1dc1502128c4bcdd25cc47", "sha256": "c013e8e760e8ec45976703120da4bb2325b57bf58cd3ecf9c9219c909ebd0675" }, "downloads": -1, "filename": "TargQC-0.0.0.tar.gz", "has_sig": false, "md5_digest": "8c3491876f1dc1502128c4bcdd25cc47", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 44985, "upload_time": "2017-04-23T17:24:21", "url": "https://files.pythonhosted.org/packages/2a/a9/20c4268b5c9371ab042e15e3a40f38767b0a7098faf801ded5b39ab15608/TargQC-0.0.0.tar.gz" } ], "1.0.0a15": [], "1.1.0": [ { "comment_text": "", "digests": { "md5": "6c810f0e18f4e15605c9ce29485079cd", "sha256": "accc8725906c3d9593f407ea02a825a9665b59e95eddf7107198a4fb8597bf86" }, "downloads": -1, "filename": "targqc-1.1.0.tar.gz", "has_sig": false, "md5_digest": "6c810f0e18f4e15605c9ce29485079cd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 128933, "upload_time": "2016-08-11T17:43:17", "url": "https://files.pythonhosted.org/packages/a7/ea/0cf63e312c1b3a1735ae7b1e5dbb3f14a6d068eeb19d7dc5216cf32712c9/targqc-1.1.0.tar.gz" } ], "1.4.2": [ { "comment_text": "", "digests": { "md5": "75c1f181e081bc29ba9a74df7cc43da5", "sha256": "606658b8302d9370a3edbe8ec50767af12294d21fb96ca359603064dde53949e" }, "downloads": -1, "filename": "TargQC-1.4.2.tar.gz", "has_sig": false, "md5_digest": "75c1f181e081bc29ba9a74df7cc43da5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37251450, "upload_time": "2017-04-23T19:00:22", "url": "https://files.pythonhosted.org/packages/31/18/67a7da978bf51d04f43c821f12151c5ef201fc7f45b83af550310adc3643/TargQC-1.4.2.tar.gz" } ], "1.4.4": [ { "comment_text": "", "digests": { "md5": "75b056abdf2e2cf9860eaa8758ab236a", "sha256": "b17e1c94759f671392ad3f31ec1a98901c8fbf7268080db831c2aadac498cf8b" }, "downloads": -1, "filename": "TargQC-1.4.4.tar.gz", "has_sig": false, "md5_digest": "75b056abdf2e2cf9860eaa8758ab236a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37312594, "upload_time": "2017-04-23T23:56:15", "url": "https://files.pythonhosted.org/packages/e0/98/f95a514128e3ee1bfc60a4a9544df9a0497b691383bfebf991973e35c414/TargQC-1.4.4.tar.gz" } ], "1.5.2": [ { "comment_text": "", "digests": { "md5": "8f7bf32bc0b497b60129f1f11d324032", "sha256": "60a5579968cd3eecbd302a096085c172223b6d32cad515582636dcf21d804d1c" }, "downloads": -1, "filename": "TargQC-1.5.2.tar.gz", "has_sig": false, "md5_digest": "8f7bf32bc0b497b60129f1f11d324032", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 47061856, "upload_time": "2017-05-29T21:41:28", "url": "https://files.pythonhosted.org/packages/33/c7/baedfb30c1bf8835cacff8fc8244c079098a7b47f6a108cb9801a88bd7bf/TargQC-1.5.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8f7bf32bc0b497b60129f1f11d324032", "sha256": "60a5579968cd3eecbd302a096085c172223b6d32cad515582636dcf21d804d1c" }, "downloads": -1, "filename": "TargQC-1.5.2.tar.gz", "has_sig": false, "md5_digest": "8f7bf32bc0b497b60129f1f11d324032", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 47061856, "upload_time": "2017-05-29T21:41:28", "url": "https://files.pythonhosted.org/packages/33/c7/baedfb30c1bf8835cacff8fc8244c079098a7b47f6a108cb9801a88bd7bf/TargQC-1.5.2.tar.gz" } ] }