{ "info": { "author": "ACEnglish", "author_email": "acenglish@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "```\n\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2557\u2588\u2588\u2588\u2588\u2588\u2588\u2557 \u2588\u2588\u2557 \u2588\u2588\u2557\u2588\u2588\u2557 \u2588\u2588\u2557 \u2588\u2588\u2588\u2588\u2588\u2557 \u2588\u2588\u2588\u2588\u2588\u2588\u2557 \u2588\u2588\u2557\n\u255a\u2550\u2550\u2588\u2588\u2554\u2550\u2550\u255d\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2551 \u2588\u2588\u2551\u2588\u2588\u2551 \u2588\u2588\u2551\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2551\n \u2588\u2588\u2551 \u2588\u2588\u2588\u2588\u2588\u2588\u2554\u255d\u2588\u2588\u2551 \u2588\u2588\u2551\u2588\u2588\u2551 \u2588\u2588\u2551\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2551\u2588\u2588\u2588\u2588\u2588\u2588\u2554\u255d\u2588\u2588\u2551\n \u2588\u2588\u2551 \u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2551 \u2588\u2588\u2551\u255a\u2588\u2588\u2557 \u2588\u2588\u2554\u255d\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2551\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2551\n \u2588\u2588\u2551 \u2588\u2588\u2551 \u2588\u2588\u2551\u255a\u2588\u2588\u2588\u2588\u2588\u2588\u2554\u255d \u255a\u2588\u2588\u2588\u2588\u2554\u255d \u2588\u2588\u2551 \u2588\u2588\u2551\u2588\u2588\u2551 \u2588\u2588\u2551\u2588\u2588\u2551\n \u255a\u2550\u255d \u255a\u2550\u255d \u255a\u2550\u255d \u255a\u2550\u2550\u2550\u2550\u2550\u255d \u255a\u2550\u2550\u2550\u255d \u255a\u2550\u255d \u255a\u2550\u255d\u255a\u2550\u255d \u255a\u2550\u255d\u255a\u2550\u255d\n```\n\nStructural variant comparison tool for VCFs\n\nGiven benchmark and comparison sets of SVs, calculate the recall, precision, and f-measure.\n\n[Spiral 
Genetics](https://www.spiralgenetics.com)\n\n[Motivation](https://docs.google.com/presentation/d/17mvC1XOpOm7khAbZwF3SgtG2Rl4M9Mro37yF2nN7GhE/edit)\n\nUPDATES\n=======\n\nTruvari has some big changes. To keep up with the retirement of Python 2.7 (https://pythonclock.org/),\nwe now support only Python 3.\n\nAdditionally, we now package Truvari so it and its dependencies can be installed directly. See Installation \nbelow. This will enable us to refactor the code for easier maintenance and reusability.\n\nFinally, we now automatically report genotype comparisons in the summary stats.\n\nInstallation\n============\n\nTruvari uses Python 3.7 and can be installed with pip:\n\n    $ pip install Truvari \n\n\nQuick start\n===========\n\n    $ truvari -b base_calls.vcf -c compare_calls.vcf -o output_dir/\n\nOutputs\n=======\n\n * tp-call.vcf -- annotated true positive calls from the COMP\n * tp-base.vcf -- annotated true positive calls from the BASE\n * fn.vcf -- false negative calls from BASE\n * fp.vcf -- false positive calls from COMP\n * base-filter.vcf -- size filtered calls from BASE\n * call-filter.vcf -- size filtered calls from COMP\n * summary.txt -- json output of performance stats\n * log.txt -- run log\n * giab_report.txt -- (optional) Summary of GIAB benchmark calls. See \"Using the GIAB Report\" below.\n\nsummary.txt\n===========\n\nThe following stats are generated for benchmarking your call set.\n\n
| Metric | Definition |
|--------|------------|
| TP-base | Number of matching calls from the base vcf |
| TP-call | Number of matching calls from the comp vcf |
| FP | Number of non-matching calls from the comp vcf |
| FN | Number of non-matching calls from the base vcf |
| precision | TP-call / (TP-call + FP) |
| recall | TP-base / (TP-base + FN) |
| f1 | 2 * (recall * precision) / (recall + precision) |
| base cnt | Number of calls in the base vcf |
| call cnt | Number of calls in the comp vcf |
| base size filtered | Number of base vcf calls outside of (sizemin, sizemax) |
| call size filtered | Number of comp vcf calls outside of (sizemin, sizemax) |
| base gt filtered | Number of base calls not passing the no-ref parameter filter |
| call gt filtered | Number of comp calls not passing the no-ref parameter filter |
| TP-call_TP-gt | TP-call variants with a genotype match |
| TP-call_FP-gt | TP-call variants without a genotype match |
| TP-base_TP-gt | TP-base variants with a genotype match |
| TP-base_FP-gt | TP-base variants without a genotype match |
| gt_precision | TP-call_TP-gt / (TP-call_TP-gt + FP + TP-call_FP-gt) |
| gt_recall | TP-base_TP-gt / (TP-base_TP-gt + FN) |
| gt_f1 | 2 * (gt_recall * gt_precision) / (gt_recall + gt_precision) |
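
As a rough illustration of how the headline metrics relate to the raw counts, the sketch below computes precision, recall, and f1 from TP-base, TP-call, FP, and FN. It is not Truvari's own code: the function name `summarize` and the example counts are made up for illustration, and f1 is taken to be the standard harmonic mean of precision and recall (with the factor of 2).

```python
def summarize(tp_base, tp_call, fp, fn):
    '''Return precision, recall, and f1 from raw match counts.

    Hypothetical helper for illustration; not part of Truvari.
    '''
    # precision is measured on the comparison calls
    precision = tp_call / (tp_call + fp) if (tp_call + fp) else 0.0
    # recall is measured on the base calls
    recall = tp_base / (tp_base + fn) if (tp_base + fn) else 0.0
    # standard F1: harmonic mean of precision and recall
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {'precision': precision, 'recall': recall, 'f1': f1}


# Made-up counts, only to exercise the function
stats = summarize(tp_base=90, tp_call=80, fp=20, fn=10)
print(stats)
```

TP-base and TP-call are kept separate because one base call may match a different number of comparison calls (for example with `--multimatch`), so the two counts need not be equal.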
\n\nMethodology\n===========\n\n```\nInput:\n BaseCall - Benchmark TruthSet of SVs\n CompCalls - Comparison SVs from another program\nBuild IntervalTree of CompCalls\nFor each BaseCall:\n Fetch CompCalls overlapping within *refdist*. \n If typematch and LevDistRatio >= *pctsim* \\\n and SizeRatio >= *pctsize* and PctRecOvl >= *pctovl*: \n Add CompCall to list of Neighbors\n Sort list of Neighbors by TruScore ((2*sim + 1*size + 1*ovl) / 3.0)\n Take CompCall with highest TruScore and BaseCall as TPs\n Only use a CompCall once if not --multimatch\n If no neighbors: BaseCall is FN\nFor each CompCall:\n If not used: mark as FP\n```\n\nMatching Parameters\n--------------------\n\n\n\n\n\n\n\n\n\n\n
| Parameter | Default | Definition |
|-----------|---------|------------|
| refdist | 500 | Maximum distance comparison calls must be within from base call's start/end |
| pctsim | 0.7 | Levenshtein distance ratio between the REF/ALT haplotype sequences of base and comparison call. See \"Comparing Haplotype Sequences of Variants\" below. |
| pctsize | 0.7 | Ratio of min(base_size, comp_size)/max(base_size, comp_size) |
| pctovl | 0.0 | Ratio of two calls' (overlapping bases)/(longest span) |
| typeignore | False | Types don't need to match to compare calls. |
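
To make the size and overlap ratios concrete, here is a minimal sketch of how a candidate pair could be checked against the thresholds. It is an illustration under stated assumptions, not Truvari's implementation: the function names are hypothetical, coordinates are assumed to be half-open `[start, end)`, and the sequence similarity is passed in as a precomputed number rather than derived from haplotypes.

```python
def size_ratio(base_size, comp_size):
    '''min(base_size, comp_size) / max(base_size, comp_size)'''
    largest = max(base_size, comp_size)
    return min(base_size, comp_size) / largest if largest else 0.0


def overlap_ratio(b_start, b_end, c_start, c_end):
    '''(overlapping bases) / (longest span), assuming half-open coordinates'''
    overlap = max(0, min(b_end, c_end) - max(b_start, c_start))
    longest = max(b_end - b_start, c_end - c_start)
    return overlap / longest if longest else 0.0


def passes_thresholds(seq_sim, size_sim, ovl,
                      pctsim=0.7, pctsize=0.7, pctovl=0.0):
    '''True when every ratio meets its threshold (defaults from the table)'''
    return seq_sim >= pctsim and size_sim >= pctsize and ovl >= pctovl


# A 100bp base call at 100-200 against an 80bp comparison call at 120-200
size_sim = size_ratio(100, 80)
ovl = overlap_ratio(100, 200, 120, 200)
print(size_sim, ovl, passes_thresholds(0.9, size_sim, ovl))
```

With the default `pctovl` of 0.0, calls that span no reference bases (such as insertions) are not rejected by the overlap check in this sketch.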
\n\nComparing VCFs without sequence resolved calls\n----------------------------------------------\n\nIf the base or comp vcfs do not have sequence resolved calls, simply set `--pctsim=0` to turn off\nsequence comparison.\n\nDifference between --sizemin and --sizefilt\n-------------------------------------------\n\n`--sizemin` is the minimum size of a base call or comparison call to be considered. \n\n`--sizefilt` is the minimum size of a call to be added into the IntervalTree for searching. It should\nbe less than `sizemin` for edge case variants.\n\nFor example: `sizemin` is 50 and `sizefilt` is 30. A 50bp base call is 98% similar to a 49bp call at \nthe same position.\n\nThese two calls should be considered matching. If we instead removed calls less than `sizemin`, we'd\nincorrectly classify the 50bp base call as a false negative.\n\nThis does have the side effect of artificially inflating specificity. If that same 49bp call in the\nabove were below the similarity threshold, it would not be classified as a FP due to the `sizemin`\nthreshold. So we're giving the call a better chance to be useful and less chance to be detrimental\nto final statistics.\n\nDefinition of annotations added to TP vcfs\n--------------------------------------------\n\n\n\n\n\n\n\n\n\n\n\n
| Anno | Definition |
|------|------------|
| TruScore | Truvari score for similarity of match. `((2*sim + 1*size + 1*ovl) / 3.0)` |
| PctSeqSimilarity | Pct sequence similarity between this variant and its closest match |
| PctSizeSimilarity | Pct size similarity between this variant and its closest match |
| PctRecOverlap | Percent reciprocal overlap of the two calls' coordinates |
| StartDistance | Distance of this call's start from matching call's start |
| EndDistance | Distance of this call's end from matching call's end |
| SizeDiff | Difference in size(basecall) and size(compcall) |
| NumNeighbors | Number of comparison calls that were in the neighborhood (REFDIST) of the base call |
| NumThresholdNeighbors | Number of comparison calls that passed threshold matching of the base call |
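
The TruScore formula and the positional annotations can be sketched as follows. This is only an illustration: the `(start, end, size)` tuple layout, the function names, and the sign convention (base minus comp) are assumptions, and the neighbor values are made up.

```python
def tru_score(sim, size, ovl):
    '''TruScore as given in the table: (2*sim + 1*size + 1*ovl) / 3.0'''
    return (2 * sim + 1 * size + 1 * ovl) / 3.0


def position_annotations(base, comp):
    '''Distances between two calls given as (start, end, size) tuples.

    The tuple layout and the base-minus-comp sign are assumptions.
    '''
    b_start, b_end, b_size = base
    c_start, c_end, c_size = comp
    return {
        'StartDistance': b_start - c_start,
        'EndDistance': b_end - c_end,
        'SizeDiff': b_size - c_size,
    }


# Hypothetical neighbors: (name, sequence sim, size sim, overlap)
neighbors = [('compA', 0.90, 0.80, 0.80), ('compB', 0.75, 0.95, 0.60)]
# The neighbor with the highest TruScore is taken as the match
best = max(neighbors, key=lambda n: tru_score(n[1], n[2], n[3]))
annos = position_annotations((100, 200, 100), (120, 200, 80))
print(best[0], annos)
```

Because sequence similarity carries a weight of 2 while the sum is divided by 3.0, a perfect match under this formula scores above 1.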
\n\nNumNeighbors and NumThresholdNeighbors are also added to the FN vcf.\n\nUsing the GIAB Report\n---------------------\n\nWhen running against the GIAB SV benchmark (link below), you can create a detailed report of \ncalls summarized by the GIAB VCF's SVTYPE, SVLEN, Technology, and Repeat annotations.\n\nTo create this report:\n\n1. Run truvari with the flag `--giabreport`.\n2. In your output directory, you will find a file named `giab_report.txt`.\n3. Next, make a copy of the \n[Truvari Report Template Google Sheet](https://docs.google.com/spreadsheets/d/1T3EdpyLO1Kq-bJ8SDatqJ5nP_wwFKCrH0qhxorvTVd4/edit?usp=sharing).\n4. Finally, paste ALL of the information inside `giab_report.txt` into the \"RawData\" tab. Be careful not \nto alter the report text in any way. If successful, the \"Formatted\" tab will contain a fully formatted report.\n\nWhile Truvari can use other benchmark sets, this formatted report currently only works with GIAB SV v0.5 and v0.6. Work\nwill need to be done to ensure Truvari can parse future GIAB SV releases.\n\nGIAB v0.6 Download Link\n\nInclude Bed & VCF Header Contigs \n--------------------------------\n\nIf an `--includebed` is provided, only base and comp calls contained within the defined regions are used \nfor comparison. This is similar to pre-filtering your base/comp calls using:\n\n```bash\n(zgrep \"#\" my_calls.vcf.gz && bedtools intersect -u -a my_calls.vcf.gz -b include.bed) | bgzip > filtered.vcf.gz\n```\n\nwith the exception that Truvari requires the start and the end to be contained in the same includebed region \nwhereas `bedtools intersect` does not.\n\nIf an `--includebed` is not provided, the comparison is restricted to only the contigs present in the base VCF\nheader. 
Therefore, any comparison calls on contigs not in the base calls will not be counted toward summary \nstatistics and will not be present in any output vcfs.\n\n\nComparing Haplotype Sequences of Variants\n-----------------------------------------\n\nTo compare sequence similarity, build each call's haplotype over the range from min(call starts) to max(call ends) and\napply the variant's sequence change to the reference. For example:\n\n```python\nhap1_seq = ref.get_seq(a1_chrom, start + 1, a1_start).seq + a1_seq + ref.get_seq(a1_chrom, a1_end + 1, end).seq\n```\n\nWhere `a1_seq` is the longer of the REF or ALT allele.\n\nMore Information\n----------------\n\nFind more details and discussions about Truvari on the [WIKI page](https://github.com/spiralgenetics/truvari/wiki).\n\n\n\n![Spiral Genetics](http://static1.squarespace.com/static/5a81ef7629f187c795c973c3/t/5a986ab453450a17fc3003e8/1533115866026/?format=1500w)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/spiralgenetics/truvari", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "Truvari", "package_url": "https://pypi.org/project/Truvari/", "platform": "", "project_url": "https://pypi.org/project/Truvari/", "project_urls": { "Homepage": "https://github.com/spiralgenetics/truvari" }, "release_url": "https://pypi.org/project/Truvari/1.3.2/", "requires_dist": [ "python-Levenshtein (==0.12.0)", "progressbar2 (==3.41.0)", "pysam (==0.15.2)", "pyfaidx (==0.5.5.2)", "intervaltree (==3.0.2)" ], "requires_python": "", "summary": "Structural variant comparison tool for VCFs", "version": "1.3.2" }, "last_serial": 5929496, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "d6ebc4580d9a69337cc8c655722f082d", "sha256": "456cab48b9f182e87d1a0197bfc631ea8d86e39654f681013b8d6faae2f048dc" }, "downloads": -1, "filename": "Truvari-1.0-py3-none-any.whl", 
"has_sig": false, "md5_digest": "d6ebc4580d9a69337cc8c655722f082d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23275, "upload_time": "2019-06-05T22:02:48", "url": "https://files.pythonhosted.org/packages/1d/32/53ebe68453d065bac9427d01492f34afa856467c5f0221448d429ecf2127/Truvari-1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6e247e3c76d9c45107db211e115193ae", "sha256": "a21d0bb083b1630dc8a8e6908901afee2a0439c5595ba117272234feebe4d8a7" }, "downloads": -1, "filename": "Truvari-1.0.tar.gz", "has_sig": false, "md5_digest": "6e247e3c76d9c45107db211e115193ae", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20944, "upload_time": "2019-06-05T22:02:50", "url": "https://files.pythonhosted.org/packages/6a/d1/f6063ffdc0016c07534d12dbb11d3b7dd3dbb38f276df036cd012cd40c2b/Truvari-1.0.tar.gz" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "3f58ae03166ae8f0c76331afdc8e8ad1", "sha256": "534955f051cb8b3fae7d7046c8807c539fcd15b4ac44ffa76732a9d1b6d36c06" }, "downloads": -1, "filename": "Truvari-1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "3f58ae03166ae8f0c76331afdc8e8ad1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23281, "upload_time": "2019-06-05T22:09:43", "url": "https://files.pythonhosted.org/packages/e3/bf/5e81266724284f9c84577701f4f268a97987c931d91e21d9465e85f7728f/Truvari-1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d58dd4a7148bd6839dc7a606de0e8e3d", "sha256": "1b4dac06860e9f16714ac2074cc7a8bb8af48e80d82d2e3a6e84af4e09962ecf" }, "downloads": -1, "filename": "Truvari-1.1.tar.gz", "has_sig": false, "md5_digest": "d58dd4a7148bd6839dc7a606de0e8e3d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20948, "upload_time": "2019-06-05T22:09:45", "url": 
"https://files.pythonhosted.org/packages/9c/a1/5e915c8b8bb87d3f777c743199cb3e6413d0fd738eed9611fd1ba520f852/Truvari-1.1.tar.gz" } ], "1.2": [ { "comment_text": "", "digests": { "md5": "9ed03993da5530961b76cd35aebd91c4", "sha256": "b50328edfa0fafdb1faecfb8423ee89756f987facd15633f63a566561980b690" }, "downloads": -1, "filename": "Truvari-1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "9ed03993da5530961b76cd35aebd91c4", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23315, "upload_time": "2019-07-15T22:53:16", "url": "https://files.pythonhosted.org/packages/ae/b0/d38c27288858619a557a7b8a29078502906f0294d8c227cf1a3f557e77e5/Truvari-1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "fe2f103abbbdd5189bab22322f667819", "sha256": "9cd34d44d64a3c6db163a3659a8abc0f43d0ca3071a616fc13949d47d5ced9cf" }, "downloads": -1, "filename": "Truvari-1.2.tar.gz", "has_sig": false, "md5_digest": "fe2f103abbbdd5189bab22322f667819", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24755, "upload_time": "2019-07-15T22:53:17", "url": "https://files.pythonhosted.org/packages/9d/89/d444243cc518686d9c5bf974e2846f1325cb289ade4fcab407fe0cf936c4/Truvari-1.2.tar.gz" } ], "1.3": [ { "comment_text": "", "digests": { "md5": "cf8d1e0fb447b0b930d4038862e3f3be", "sha256": "ec7803f8b4578caedc7f2c7c94ff07fd8fb1b398fc8cd3847b71556070d3eb61" }, "downloads": -1, "filename": "Truvari-1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "cf8d1e0fb447b0b930d4038862e3f3be", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23609, "upload_time": "2019-09-26T06:04:02", "url": "https://files.pythonhosted.org/packages/03/d8/a703bd1189689d185d4bcd1bf27fc824d2bd5e9c234d9de367a00b301728/Truvari-1.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0e39ce8a073b4416094118f5bfb897e2", "sha256": "9bbf6b98c632b4de4f97791cfaca9af14541d773f29a3b847dfd5edda678dc25" }, 
"downloads": -1, "filename": "Truvari-1.3.tar.gz", "has_sig": false, "md5_digest": "0e39ce8a073b4416094118f5bfb897e2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21483, "upload_time": "2019-09-26T06:04:04", "url": "https://files.pythonhosted.org/packages/be/97/b8b8fcf4885828db1f632c57a7ee947445f155538692ad606a9273e870a0/Truvari-1.3.tar.gz" } ], "1.3.1": [ { "comment_text": "", "digests": { "md5": "993c28e758e5776694da2890a4bc6d3a", "sha256": "644a646edc3a78e0e0e1031fb9af9f9cf284a6752f01dd902f0f6da87c5905d6" }, "downloads": -1, "filename": "Truvari-1.3.1-py3-none-any.whl", "has_sig": false, "md5_digest": "993c28e758e5776694da2890a4bc6d3a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23629, "upload_time": "2019-09-30T18:47:17", "url": "https://files.pythonhosted.org/packages/49/e7/13dd1acb2ab9d477779df77990dcae8f879f35812f12ad2e55476c82b2a8/Truvari-1.3.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9b3b9f9035d1398c9a876e99ff271a07", "sha256": "63b1b3b25de1058f77e4b4bb2ca94dd35790dbae54e6079b8f40d9e30a2998ab" }, "downloads": -1, "filename": "Truvari-1.3.1.tar.gz", "has_sig": false, "md5_digest": "9b3b9f9035d1398c9a876e99ff271a07", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21486, "upload_time": "2019-09-30T18:47:20", "url": "https://files.pythonhosted.org/packages/2f/11/6850dff4cbf7359e019502f9bce0c8c28f81537f5508bcd2f599802fb02b/Truvari-1.3.1.tar.gz" } ], "1.3.2": [ { "comment_text": "", "digests": { "md5": "1eba10e7688d9c627b2c573867c37f3e", "sha256": "fe3ccc4d94186c25ee2c5455fb364b73b9ef72db73162a118488ed1fdc2b01c0" }, "downloads": -1, "filename": "Truvari-1.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1eba10e7688d9c627b2c573867c37f3e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18999, "upload_time": "2019-10-04T17:44:53", "url": 
"https://files.pythonhosted.org/packages/49/c7/20d46334fe058ef00c44d1608b4c6206f3b545ce23edbd9ca9e6f435079a/Truvari-1.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5660efb830378503c88be6c416ddbbdf", "sha256": "2e66fa83c1c39b4cbac58e7a7c64e1cde5d9ee4055dd548b8ea9ee81d56362ab" }, "downloads": -1, "filename": "Truvari-1.3.2.tar.gz", "has_sig": false, "md5_digest": "5660efb830378503c88be6c416ddbbdf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17925, "upload_time": "2019-10-04T17:44:55", "url": "https://files.pythonhosted.org/packages/b1/63/134dec6010259f9df6c5450a1bd10c0ca5d0e13d8497ec3c1e9ddaf6297c/Truvari-1.3.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1eba10e7688d9c627b2c573867c37f3e", "sha256": "fe3ccc4d94186c25ee2c5455fb364b73b9ef72db73162a118488ed1fdc2b01c0" }, "downloads": -1, "filename": "Truvari-1.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1eba10e7688d9c627b2c573867c37f3e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18999, "upload_time": "2019-10-04T17:44:53", "url": "https://files.pythonhosted.org/packages/49/c7/20d46334fe058ef00c44d1608b4c6206f3b545ce23edbd9ca9e6f435079a/Truvari-1.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5660efb830378503c88be6c416ddbbdf", "sha256": "2e66fa83c1c39b4cbac58e7a7c64e1cde5d9ee4055dd548b8ea9ee81d56362ab" }, "downloads": -1, "filename": "Truvari-1.3.2.tar.gz", "has_sig": false, "md5_digest": "5660efb830378503c88be6c416ddbbdf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17925, "upload_time": "2019-10-04T17:44:55", "url": "https://files.pythonhosted.org/packages/b1/63/134dec6010259f9df6c5450a1bd10c0ca5d0e13d8497ec3c1e9ddaf6297c/Truvari-1.3.2.tar.gz" } ] }