{ "info": { "author": "Daniel Standage", "author_email": "daniel.standage@nbacc.dhs.gov", "bugtrack_url": null, "classifiers": [ "Environment :: Console", "Framework :: IPython", "Framework :: Jupyter", "Intended Audience :: Legal Industry", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "# MicroHapDB\n\n[![MicroHapDB build status][cibadge]](https://github.com/bioforensics/MicroHapDB/actions)\n[![Install with bioconda][condabadge]](http://bioconda.github.io/recipes/microhapdb/README.html)\n[![BSD licensed][licensebadge]](https://github.com/bioforensics/MicroHapDB/blob/master/LICENSE.txt)\n\nDaniel Standage, 2018-2022\nhttps://github.com/bioforensics/microhapdb\n\n**MicroHapDB** is a portable database intended for scientists and researchers interested in microhaplotypes for forensic analysis.\nThe database includes a comprehensive collection of marker and allele frequency data from numerous databases and published research articles.[5-15]\nEffective number of allele (*Ae*)[2] and informativeness for assignment (*In*)[3] statistics are included so that markers can be ranked for different forensic applications.\nThe entire contents of the database are distributed with each copy of MicroHapDB, and instructions for adding private data to a local copy of the database are provided.\nMicroHapDB is designed to be user-friendly for both practitioners and researchers, supporting a range of access methods from browsing to simple text queries to complex queries to full programmatic access via a Python API.\nMicroHapDB is also designed as a community resource requiring minimal infrastructure to use and maintain.\n\n\n## Installation\n\nFor best results, install from [bioconda](https://bioconda.github.io/).\n\n```\nconda install -c bioconda microhapdb\n```\n\nTo make sure the package installed correctly:\n\n```\nconda install pytest\npytest --pyargs microhapdb --doctest-modules\n```\n\nConda ensures the correct installation of Python version 3 and the [Pandas][] library, which are required by MicroHapDB.\n\n\n## Usage\n\n### Browsing\n\nTyping `microhapdb marker` on the command line will print a complete listing of all microhap markers in MicroHapDB to your terminal window.\nThe commands `microhapdb population` and `microhapdb frequency` will do the same for population descriptions and allele frequencies.\n\n> *__WARNING__: it's unlikely the entire data table will fit on your screen at once, so you may have to scroll back in your terminal to view all rows of the table.*\n\nAlternatively, the files `marker.tsv`, `population.tsv`, and `frequency.tsv` can be opened in Excel or loaded into your statistics/datascience environment of choice.\nType `microhapdb --files` on the command line to see the location of these files.\n\n### Database queries\n\nThe `microhapdb lookup ` command searches all data tables for relevant records with a user-provided name, identifier, or description, such as `mh06PK-24844`, `rs8192488`, or `Yoruba`.\n\nThe `microhapdb marker ` command searches the microhap markers with one or more user-provided names or identifiers.\nThe command also supports region-based queries (such as `chr1` or `chr12:1000000-5000000`), and can print either a tabular report or a detailed report.\nRun `microhapdb marker --help` for additional details.\n\nThe `microhapdb population ` command searches the population & cohort table with one or more user-provided names or identifiers.\nRun `microhapdb population --help` for additional details.\n\nThe `microhapdb frequency --marker --population --allele ` command searches the allele frequency table.\nThe search can be restricted using all query terms (marker, population, and allele), or broadened by dropping one or more of the query terms.\nRun `microhapdb frequency --help` for additional details.\n\n\"MicroHapDB\n\n### Python API\n\nTo access MicroHapDB from Python, simply invoke `import microhapdb` and query the following tables.\n\n- `microhapdb.markers`\n- `microhapdb.populations`\n- `microhapdb.frequencies`\n\nEach is a [Pandas][][1] dataframe object, supporting convenient and efficient listing, subsetting, and query capabilities.\n\n\"MicroHapDB\n\nSee the [Pandas][] documentation for more details on dataframe access and query methods.\n\nMicroHapDB also includes 4 auxiliary tables, which may be useful in a variety of scenarios.\n\n- `microhapdb.variantmap`: contains a mapping of dbSNP variants to their corresponding microhap markers\n- `microhapdb.idmap`: cross-references external names and identifiers with internal MicroHapDB identifiers\n- `microhapdb.sequences`: contains the sequence spanning and flanking each microhap locus\n- `microhapdb.indels`: contains variant information for markers that include insertion/deletion variants\n\n\n## Ranking Markers\n\nMicroHapDB provides three criteria for ranking markers.\n\n- `Ae`: the marker's effective number of alleles (*Ae*) computed individually for 26 populations[2]; by default, the average of the 26 populations is shown, but the `--ae-pop` flag or the `microhapdb.set_ae_population` function can be used to specify a single population for with to display Ae values\n- `In`: Rosenberg's informativeness for assignment (*In*) computed on 26 populations[3]\n- `Fst`: fixation index (*FST*) computed on 26 populations[4]\n\nThe Ae statistic is a measure of the *within-population* allelic variation at a locus, which corresponds to the marker's diagnostic power for identification purposes.\nThe In and FST statistics measure *between-population* allelic variation at a locus, which corresponds to the marker's utility for predicting population of origin.\nPhased genotypes for 2,504 individuals from Phase 3 of the 1000 Genomes Project[14] are used to calculate these statistics.\n\n\n## Adding Markers to MicroHapDB\n\n> *I have some private microhap markers.\n> Is it possible to include these in my MicroHapDB queries without making them public?*\n\nCertainly!\nSee the [dbbuild](dbbuild/) directory for instructions on updating a private copy the database with additional sources of data.\n\n> *I have published (or am getting ready to publish) a new panel of microhap markers and allele frequencies.\n> Could you add these to MicroHapDB?*\n\nCertainly!\nThe instructions in the [dbbuild](dbbuild/) directory describe what data files are required.\nWe would be happy to assist getting data into the correct format if that would help\u2014just let us know by opening a thread on [MicroHapDB's issue tracker](https://github.com/bioforensics/MicroHapDB/issues/new).\n\n> *My favorite data set of population-level genomic variation is not available in MicroHapDB.\n> Could you add it?*\n\nWe would be happy to consider including any data set with compatible licensing or user agreements.\nThe [dbbuild](dbbuild/) directory describes the information required for marker and population definitions.\nComputing allele frequencies from population data typically requires variant calls (VCFs) with phased genotypes for individuals.\nUse [MicroHapDB's issue tracker](https://github.com/bioforensics/MicroHapDB/issues/new) to contact us with any questions about including new public data sets.\n\n\n## Citation\n\nIf you use this database, please cite our work.\n\n> Standage DS, Mitchell RN (2020) MicroHapDB: A Portable and Extensible Database of All Published Microhaplotype Marker and Frequency Data. *Frontiers in Genetics* 11:781, [doi:10.3389/fgene.2020.00781](https://doi.org/10.3389/fgene.2020.00781).\n\n----------\n\n\n## References\n\n### Supporting Software\n\n[1]McKinney W (2010) Data structures for statistical computing in Python. *Proceedings of the 9th Python in Science Conference, 51-56*.\n\n### Ranking Statistics\n\n[2]Crow JF, Kimura M (1970) An Introduction to Population Genetics Theory. New York, Harper & Row.\n\n[3]Rosenberg NA, Li LM, Ward R, Pritchard JK (2003) Informativeness of genetic markers for inference of ancestry. *American Journal of Human Genetics*, 73(6):1402\u20131422, [doi:10.1086/380416](https://doi.org/10.1086/380416).\n\n[4]Weir B, Cockerham, C (1984) Estimating F-Statistics for the Analysis of Population Structure. *Evolution* 38(6):1358-1370, [doi:10.2307/2408641](https://doi.org/10.2307/2408641).\n\n### Published Marker collections and Allele Frequency Data\n\n[5]Rajeevan H, Soundararajan U, Kidd JR, Pakstis AJ, Kidd KK (2012) ALFRED: an allele frequency resource for research and teaching. *Nucleic Acids Research*, 40(D1): D1010-D1015, [doi:10.1093/nar/gkr924](https://doi.org/10.1093/nar/gkr924).\n\n[6]Kidd KK, Pakstis AJ, Speed WC, Lagace R, Wootton S, Chang J (2018) Selecting microhaplotypes optimized for different purposes. *Electrophoresis*, [doi:10.1002/elps.201800092](https://doi.org/10.1002/elps.201800092).\n\n[7]Kidd KK, Rajeevan H (2018) ALFRED data download. *The Allele Frequency Database*, https://alfred.med.yale.edu/alfred/selectDownload/Microhap_alleleF_198.txt. Accessed December 7, 2018.\n\n[8]van der Gaag KJ, de Leeuw RH, Laros JFJ, den Dunnen JT, de Knijff P (2018) Short hypervariable microhaplotypes: A novel set of very short high discriminating power loci without stutter artefacts. *Forensic Science International: Genetics*, 35:169-175, [doi:10.1016/j.fsigen.2018.05.008](https://doi.org/10.1016/j.fsigen.2018.05.008).\n\n[9]Staadig A, Tillmar A (2019) Evaluation of microhaplotypes\u2014A promising new type of forensic marker. *The 28th Congress of the International Society for Forensic Genetics*, P597.\n\n[10]Hiroaki N, Fujii K, Kitayama T, Sekiguchi K, Nakanishi H, Saito K (2015) Approaches for identifying multiple-SNP haplotype blocks for use in human identification. *Legal Medicine*, 17(5):415-420, [doi:10.1016/j.legalmed.2015.06.003](https://doi.org/10.1016/j.legalmed.2015.06.003).\n\n[11]Chen P, Deng C, Li Z, Pu Y, Yang J, Yu Y, Li K, Li D, Liang W, Zhang L, Chen F (2019) A microhaplotypes panel for massively parallel sequencing analysis of DNA mixtures. *FSI: Genetics*, 40:140-149, [doi:10.1016/j.fsigen.2019.02.018](https://doi.org/10.1016/j.fsigen.2019.02.018).\n\n[12]Voskoboinik L, Motro U, Darvasi A (2018) Facilitating complex DNA mixture interpretation by sequencing highly polymorphic haplotypes. *FSI: Genetics*, 35:136-140, [doi:10.1016/j.fsigen.2018.05.001](https://doi.org/10.1016/j.fsigen.2018.05.001).\n\n[13]de la Puente M, Phillips C, Xavier C, Amigo J, Carracedo A, Parson W, Lareu MV (2020) Building a custom large-scale panel of novel microhaplotypes for forensic identification using MiSeq and Ion S5 massively parallel sequencing systems. *FSI: Genetics*, 45:102213, [doi:10.1016/j.fsigen.2019.102213](https://doi.org/10.1016/j.fsigen.2019.102213).\n\n[14]Auton A, Abecasis G, Altshuler D, et al. (2015) A global reference for human genetic variation. *Nature* 526:68\u201374, [doi:10.1038/nature15393](https://doi.org/10.1038/nature15393).\n\n[15]Gandotra N, Speed WC, Qin W, Tang Y, Pakstis AJ, Kidd KK, Scharfe C (2020) Validation of novel forensic DNA markers using multiplex microhaplotype sequencing. *Forensic Science International: Genetics*, **47**:102275, [doi:10.1016/j.fsigen.2020.102275](https://doi.org/10.1016/j.fsigen.2020.102275).\n\n[alfred]: https://alfred.med.yale.edu/alfred/alfredDataDownload.asp\n[Pandas]: https://pandas.pydata.org\n[cibadge]: https://github.com/bioforensics/MicroHapDB/workflows/CI%20Build/badge.svg\n[pypibadge]: https://img.shields.io/pypi/v/microhapdb.svg\n[condabadge]: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg\n[licensebadge]: https://img.shields.io/badge/license-BSD-blue.svg\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/bioforensics/MicroHapDB", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "microhapdb", "package_url": "https://pypi.org/project/microhapdb/", "platform": "", "project_url": "https://pypi.org/project/microhapdb/", "project_urls": { "Homepage": "https://github.com/bioforensics/MicroHapDB" }, "release_url": "https://pypi.org/project/microhapdb/0.7/", "requires_dist": null, "requires_python": "", "summary": "Portable database of microhaplotype marker and allele frequency data", "version": "0.7", "yanked": false, "yanked_reason": null }, "last_serial": 12636714, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "9efa5c76e0f715fd7084be1f97e76a1a", "sha256": "aaf41f6fd9078d5d229bb9e5de920842d2e12432a261755a04fc55cc1dff023f" }, "downloads": -1, "filename": "microhapdb-0.1.0.tar.gz", "has_sig": false, "md5_digest": "9efa5c76e0f715fd7084be1f97e76a1a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24622, "upload_time": "2018-10-10T16:13:21", "upload_time_iso_8601": "2018-10-10T16:13:21.696184Z", "url": "https://files.pythonhosted.org/packages/ee/b9/90172ce0bd8dc37c5526c0674e88001a1c8dbc79067b860e4ac6b306ecf3/microhapdb-0.1.0.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "e79419cebe9318ba60cdf762ebb911e3", "sha256": "c22e45f181094c54d1370c3ea0cb65f57d3d889a0fb87a76def67d5e50d4cc08" }, "downloads": -1, "filename": "microhapdb-0.1.1.tar.gz", "has_sig": false, "md5_digest": "e79419cebe9318ba60cdf762ebb911e3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 293851, "upload_time": "2018-10-10T16:59:13", "upload_time_iso_8601": "2018-10-10T16:59:13.716621Z", "url": "https://files.pythonhosted.org/packages/aa/72/2afb13e19bd12c8b2f01c4bf90765e286d187c9f8bdda7023018d30cdb2e/microhapdb-0.1.1.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "7c0d382788bbdfb8df64a0f0d817c092", "sha256": "03e4ca5820741e67f04f85d777a7acd331640cbb4dd7dec56b22aecdb4676b70" }, "downloads": -1, "filename": "microhapdb-0.1.2.tar.gz", "has_sig": false, "md5_digest": "7c0d382788bbdfb8df64a0f0d817c092", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 293616, "upload_time": "2018-10-10T18:25:46", "upload_time_iso_8601": "2018-10-10T18:25:46.238229Z", "url": "https://files.pythonhosted.org/packages/1a/17/225f5ce31fdad8554b08a97c7777ed210156768868b3be7b82d85dff86d3/microhapdb-0.1.2.tar.gz", "yanked": false, "yanked_reason": null } ], "0.2": [ { "comment_text": "", "digests": { "md5": "2f9d3c7790cc027c904fea82711ecbb0", "sha256": "5d65dfc9eb53079076da7efadc9d0f5d32ffd3df88674c1bad70b36ca3b8d9e8" }, "downloads": -1, "filename": "microhapdb-0.2.tar.gz", "has_sig": false, "md5_digest": "2f9d3c7790cc027c904fea82711ecbb0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 917418, "upload_time": "2018-12-07T03:13:30", "upload_time_iso_8601": "2018-12-07T03:13:30.443185Z", "url": "https://files.pythonhosted.org/packages/79/93/42989a5c94cff6e2be24185e240319ad7bd1402858d944efa81e52ae3e5a/microhapdb-0.2.tar.gz", "yanked": false, "yanked_reason": null } ], "0.4": [ { "comment_text": "", "digests": { "md5": "fbc9ca12d3969c60752e114493a87d7b", "sha256": "e7468af01da9307b0f3d0520b5a98464d61384c7752e86b4ca0ad83284be4f71" }, "downloads": -1, "filename": "microhapdb-0.4.tar.gz", "has_sig": false, "md5_digest": "fbc9ca12d3969c60752e114493a87d7b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 526357, "upload_time": "2019-10-23T03:08:50", "upload_time_iso_8601": "2019-10-23T03:08:50.804868Z", "url": "https://files.pythonhosted.org/packages/50/45/d65815c6c175568cf697302430ebf30bf5dab5f224c218c94e1e663af338/microhapdb-0.4.tar.gz", "yanked": false, "yanked_reason": null } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "2e0132d54beef18163c24eeacd7ae3f1", "sha256": "252170ba75aa1365b607b1906e8eee3e5337c0b1cfa9fff5c730fed30047e511" }, "downloads": -1, "filename": "microhapdb-0.4.1.tar.gz", "has_sig": false, "md5_digest": "2e0132d54beef18163c24eeacd7ae3f1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 526374, "upload_time": "2019-10-23T03:41:19", "upload_time_iso_8601": "2019-10-23T03:41:19.458515Z", "url": "https://files.pythonhosted.org/packages/f8/37/a2430a19c7329b2f7b6c448f6f77f2856cdeb2a96ef28cf2ac252f010bee/microhapdb-0.4.1.tar.gz", "yanked": false, "yanked_reason": null } ], "0.4.3": [ { "comment_text": "", "digests": { "md5": "3545844d5c4e10807001dce80ad1cb29", "sha256": "064b325cc286e0081746ec672ca694fc1ee1cf0008dc8e1ffcee531fcbc35698" }, "downloads": -1, "filename": "microhapdb-0.4.3.tar.gz", "has_sig": false, "md5_digest": "3545844d5c4e10807001dce80ad1cb29", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 538140, "upload_time": "2019-11-06T15:53:53", "upload_time_iso_8601": "2019-11-06T15:53:53.956714Z", "url": "https://files.pythonhosted.org/packages/df/d3/2e76b9ddd603a07ec37966e91dc1a01ca6207a5b51e90e8b59010d3bd2db/microhapdb-0.4.3.tar.gz", "yanked": false, "yanked_reason": null } ], "0.7": [ { "comment_text": "", "digests": { "md5": "80064ac9a805448f92e89244d6b54de3", "sha256": "24535bfcd6903d2025b6000fb87bbbfa8da25d2344396c23467abc84e361b49b" }, "downloads": -1, "filename": "microhapdb-0.7.tar.gz", "has_sig": false, "md5_digest": "80064ac9a805448f92e89244d6b54de3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 929778, "upload_time": "2022-01-20T20:57:37", "upload_time_iso_8601": "2022-01-20T20:57:37.768976Z", "url": "https://files.pythonhosted.org/packages/2a/d8/af81dbd145449f53f85ef8e7b566c37a1404cbf04e96c57eafdc3eab3416/microhapdb-0.7.tar.gz", "yanked": false, "yanked_reason": null } ], "0.7rc1": [ { "comment_text": "", "digests": { "md5": "fc05f22c767d5684c9ee89caed69687c", "sha256": "c92155cb5af810ac9a9d3b7b42c74669802f92f3a443de18fa32fef48a745f52" }, "downloads": -1, "filename": "microhapdb-0.7rc1.tar.gz", "has_sig": false, "md5_digest": "fc05f22c767d5684c9ee89caed69687c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 929816, "upload_time": "2022-01-20T20:46:49", "upload_time_iso_8601": "2022-01-20T20:46:49.625544Z", "url": "https://files.pythonhosted.org/packages/87/8d/6680f68c36ab955b2ea417ddd235181fa4dec042a89dff156b693ccc4632/microhapdb-0.7rc1.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "80064ac9a805448f92e89244d6b54de3", "sha256": "24535bfcd6903d2025b6000fb87bbbfa8da25d2344396c23467abc84e361b49b" }, "downloads": -1, "filename": "microhapdb-0.7.tar.gz", "has_sig": false, "md5_digest": "80064ac9a805448f92e89244d6b54de3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 929778, "upload_time": "2022-01-20T20:57:37", "upload_time_iso_8601": "2022-01-20T20:57:37.768976Z", "url": "https://files.pythonhosted.org/packages/2a/d8/af81dbd145449f53f85ef8e7b566c37a1404cbf04e96c57eafdc3eab3416/microhapdb-0.7.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }