{ "info": { "author": "Andrew Riha", "author_email": "apriha@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "Intended Audience :: End Users/Desktop", "Intended Audience :: Healthcare Industry", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Bio-Informatics", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Utilities" ], "description": ".. image:: https://raw.githubusercontent.com/apriha/snps/master/docs/images/snps_banner.png\n\n|build| |codecov| |docs| |pypi| |python| |downloads|\n\nsnps\n====\ntools for reading, writing, merging, and remapping SNPs \ud83e\uddec\n\nCapabilities\n------------\n- Read raw data (genotype) files from a variety of direct-to-consumer (DTC) DNA testing sources\n- Read and write VCF files for Builds 36, 37, and 38 (e.g., convert `23andMe `_ to VCF)\n- Merge raw data files from different DNA tests, identifying discrepant SNPs in the process\n- Remap SNPs between assemblies / builds (e.g., convert SNPs from Build 36 to Build 37, etc.)\n\nSupported Genotype Files\n------------------------\n``snps`` supports `VCF `_ files and\ngenotype files from the following DNA testing sources:\n\n- `23andMe `_\n- `Ancestry `_\n- `C\u00f3digo 46 `_\n- `Family Tree DNA `_\n- `Genes for Good `_\n- `LivingDNA `_\n- `Mapmygenome `_\n- `MyHeritage `_\n\nDependencies\n------------\n``snps`` requires `Python `_ 3.5+ and the following Python packages:\n\n- `numpy `_\n- `pandas `_\n- `atomicwrites `_\n- `PyVCF `_\n\nInstallation\n------------\n``snps`` is `available `_ on the\n`Python Package Index `_. Install ``snps`` (and its required\nPython dependencies) via ``pip``::\n\n $ pip install snps\n\nExamples\n--------\nDownload Example Data\n`````````````````````\nLet's download some example data from `openSNP `_:\n\n>>> from snps.resources import Resources\n>>> r = Resources()\n>>> paths = r.download_example_datasets()\nDownloading resources/662.23andme.340.txt.gz\nDownloading resources/662.ftdna-illumina.341.csv.gz\n\nLoad Raw Data\n`````````````\nLoad a `23andMe `_ raw data file:\n\n>>> from snps import SNPs\n>>> s = SNPs('resources/662.23andme.340.txt.gz')\n\nThe loaded SNPs are available via a ``pandas.DataFrame``:\n\n>>> df = s.snps\n>>> df.columns.values\narray(['chrom', 'pos', 'genotype'], dtype=object)\n>>> df.index.name\n'rsid'\n>>> len(df)\n991786\n\n``snps`` also attempts to detect the build / assembly of the data:\n\n>>> s.build\n37\n>>> s.build_detected\nTrue\n>>> s.assembly\n'GRCh37'\n\nRemap SNPs\n``````````\nLet's remap the SNPs to change the assembly / build:\n\n>>> s.snps.loc[\"rs3094315\"].pos\n752566\n>>> chromosomes_remapped, chromosomes_not_remapped = s.remap_snps(38)\nDownloading resources/GRCh37_GRCh38.tar.gz\n>>> s.build\n38\n>>> s.assembly\n'GRCh38'\n>>> s.snps.loc[\"rs3094315\"].pos\n817186\n\nSNPs can be remapped between Build 36 (``NCBI36``), Build 37 (``GRCh37``), and Build 38\n(``GRCh38``).\n\nMerge Raw Data Files\n````````````````````\nThe dataset consists of raw data files from two different DNA testing sources. Let's combine\nthese files using a ``SNPsCollection``.\n\n>>> from snps import SNPsCollection\n>>> sc = SNPsCollection(\"resources/662.ftdna-illumina.341.csv.gz\", name=\"User662\")\nLoading resources/662.ftdna-illumina.341.csv.gz\n>>> sc.build\n36\n>>> chromosomes_remapped, chromosomes_not_remapped = sc.remap_snps(37)\nDownloading resources/NCBI36_GRCh37.tar.gz\n>>> sc.snp_count\n708092\n\nAs the data gets added, it's compared to the existing data, and SNP position and genotype\ndiscrepancies are identified. (The discrepancy thresholds can be tuned via parameters.)\n\n>>> sc.load_snps([\"resources/662.23andme.340.txt.gz\"], discrepant_genotypes_threshold=300)\nLoading resources/662.23andme.340.txt.gz\n27 SNP positions were discrepant; keeping original positions\n151 SNP genotypes were discrepant; marking those as null\n>>> len(sc.discrepant_snps) # SNPs with discrepant positions and genotypes, dropping dups\n169\n>>> sc.snp_count\n1006960\n\nSave SNPs\n`````````\nOk, so far we've remapped the SNPs to the same build and merged the SNPs from two files,\nidentifying discrepancies along the way. Let's save the merged dataset consisting of over 1M+\nSNPs to a CSV file:\n\n>>> saved_snps = sc.save_snps()\nSaving output/User662_GRCh37.csv\n\nMoreover, let's get the reference sequences for this assembly and save the SNPs as a VCF file:\n\n>>> saved_snps = sc.save_snps(vcf=True)\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.1.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.2.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.3.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.4.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.5.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.6.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.7.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.8.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.9.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.10.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.11.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.12.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.13.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.14.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.15.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.16.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.17.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.18.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.19.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.20.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.21.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.22.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.X.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.Y.fa.gz\nDownloading resources/fasta/GRCh37/Homo_sapiens.GRCh37.dna.chromosome.MT.fa.gz\nSaving output/User662_GRCh37.vcf\n\nAll `output files `_ are saved to the\noutput directory.\n\nDocumentation\n-------------\nDocumentation is available `here `_.\n\nAcknowledgements\n----------------\nThanks to Mike Agostino, Padma Reddy, Kevin Arvai, `openSNP `_,\n`Open Humans `_, and `Sano Genetics `_.\n\n.. https://github.com/rtfd/readthedocs.org/blob/master/docs/badges.rst\n.. |build| image:: https://travis-ci.org/apriha/snps.svg?branch=master\n :target: https://travis-ci.org/apriha/snps\n.. |codecov| image:: https://codecov.io/gh/apriha/snps/branch/master/graph/badge.svg\n :target: https://codecov.io/gh/apriha/snps\n.. |docs| image:: https://readthedocs.org/projects/snps/badge/?version=latest\n :target: https://snps.readthedocs.io/\n.. |pypi| image:: https://img.shields.io/pypi/v/snps.svg\n :target: https://pypi.python.org/pypi/snps\n.. |python| image:: https://img.shields.io/pypi/pyversions/snps.svg\n :target: https://www.python.org\n.. |downloads| image:: https://pepy.tech/badge/snps\n :target: https://pepy.tech/project/snps\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/apriha/snps", "keywords": "snps dna chromosomes bioinformatics", "license": "BSD 3-Clause License", "maintainer": "", "maintainer_email": "", "name": "snps", "package_url": "https://pypi.org/project/snps/", "platform": "any", "project_url": "https://pypi.org/project/snps/", "project_urls": { "Changelog": "https://github.com/apriha/snps/releases", "Homepage": "https://github.com/apriha/snps", "Issue Tracker": "https://github.com/apriha/snps/issues" }, "release_url": "https://pypi.org/project/snps/0.5.0/", "requires_dist": [ "numpy", "pandas", "atomicwrites", "PyVCF" ], "requires_python": ">=3.5", "summary": "tools for reading, writing, merging, and remapping SNPs", "version": "0.5.0" }, "last_serial": 5988056, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "4e7400d73fdc1e55791047dab909a485", "sha256": "6cccd6f4584824cedd246646ecc5556ccb8bba1da033cd70e57a8bf3e607f13c" }, "downloads": -1, "filename": "snps-0.0.0.tar.gz", "has_sig": false, "md5_digest": "4e7400d73fdc1e55791047dab909a485", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 2849, "upload_time": "2019-06-09T22:15:56", "url": "https://files.pythonhosted.org/packages/10/47/8853a3b6395c434916081561fc93823e2d58c2458f15367f63810ebf19c6/snps-0.0.0.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "63af148c339200adb99b70adc92233ad", "sha256": "584e1b797d0a62160951ef2989c5adfbb18e7eb1b035c50e986e42dc611e6c11" }, "downloads": -1, "filename": "snps-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "63af148c339200adb99b70adc92233ad", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 35166, "upload_time": "2019-06-12T08:03:40", "url": "https://files.pythonhosted.org/packages/be/d1/7f45b11548cce734b1487aeda53db20162efa9fbb5ec1ffe94e580668335/snps-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a72bb4982cb508bfa52891ae1a8bebf2", "sha256": "a3da274e56e85e42c91ae419fefbbcc18190003c54f464d1a767f26e5b39b06f" }, "downloads": -1, "filename": "snps-0.1.0.tar.gz", "has_sig": false, "md5_digest": "a72bb4982cb508bfa52891ae1a8bebf2", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 65444, "upload_time": "2019-06-12T08:03:41", "url": "https://files.pythonhosted.org/packages/1b/ba/182bb1f759cdb66b8ee80f7080248765c8b945091a4b7e621aabda98b13a/snps-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "96c1f3a43af9447832e42333643c01ad", "sha256": "4c92549aa12a27013b42a046e2a2d11d00f35b7f978ddf5f4fef3563a9fc7eae" }, "downloads": -1, "filename": "snps-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "96c1f3a43af9447832e42333643c01ad", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 35193, "upload_time": "2019-06-16T18:17:25", "url": "https://files.pythonhosted.org/packages/ef/72/09c39d104da2749efc1b59fd405f18ab57356100b4b550d7c0f8dd78b441/snps-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "66a6d5e548ff217145b3c8355cc33ea4", "sha256": "35629ea539619cb01dc5c68054273c40db2f0bfac99bc1c866f02b4a6cf013c1" }, "downloads": -1, "filename": "snps-0.1.1.tar.gz", "has_sig": false, "md5_digest": "66a6d5e548ff217145b3c8355cc33ea4", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 65463, "upload_time": "2019-06-16T18:17:27", "url": "https://files.pythonhosted.org/packages/62/b3/2483ee9c0f2fa76bfe14f99aa338816695c337e9588eca88fb2d37894be5/snps-0.1.1.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "b83ec6d501c606a7406653e0d4efdf0e", "sha256": "b180a1c3c087b92913b22c93ed1bd84ed81f9cc5264a125835f9eee6769a48af" }, "downloads": -1, "filename": "snps-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "b83ec6d501c606a7406653e0d4efdf0e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 35340, "upload_time": "2019-06-20T06:29:38", "url": "https://files.pythonhosted.org/packages/56/05/31f22e8d4d91fbfc962801ab70d88e27d2e80746332d340b403903739772/snps-0.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "416712570b646633c9459ab5c7644287", "sha256": "1ae1b738325858a27d8d7cc682f30171174badda761cd32fe6a6bbed853d4269" }, "downloads": -1, "filename": "snps-0.2.0.tar.gz", "has_sig": false, "md5_digest": "416712570b646633c9459ab5c7644287", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 65824, "upload_time": "2019-06-20T06:29:40", "url": "https://files.pythonhosted.org/packages/46/8c/3a8cc337995931da456b029834de166d13b116f4ea7bd62758bb24408953/snps-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "78502580cca8056351a5704d33184eb1", "sha256": "703189c6c5eeea22256c6d4fa70ae845eb976e5f99e89c9a00541ed9f928b89a" }, "downloads": -1, "filename": "snps-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "78502580cca8056351a5704d33184eb1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 35329, "upload_time": "2019-07-08T00:05:52", "url": "https://files.pythonhosted.org/packages/db/7f/b3d95a38e641ba8b41405acbd815d123d31c48cd4a6955c2116c60c25b30/snps-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e33c2f6eeca7d1811d5919f95c5fe441", "sha256": "acace20f8314ea0b80dfb840f8b2c4316c3d6ee8799f01fa1c6c4d0e16ac37d8" }, "downloads": -1, "filename": "snps-0.2.1.tar.gz", "has_sig": false, "md5_digest": "e33c2f6eeca7d1811d5919f95c5fe441", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 65832, "upload_time": "2019-07-08T00:05:54", "url": "https://files.pythonhosted.org/packages/3a/04/15f97db3ca54022624be3e2ec350e70741ea1ab2280d9d0c0a651673ef38/snps-0.2.1.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "e81a3b732daf5a71328779ab81a1580f", "sha256": "292a8ac356e05289058982a5c00b4de58c9bbfd9844a601539b7a631151da289" }, "downloads": -1, "filename": "snps-0.3.0-py3-none-any.whl", "has_sig": false, "md5_digest": "e81a3b732daf5a71328779ab81a1580f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 36024, "upload_time": "2019-07-29T01:02:10", "url": "https://files.pythonhosted.org/packages/1a/ee/16aa227bb73727c7fe6ea95f9c9109333479be6c9fdedde268bc0a584535/snps-0.3.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9e43ffcd5c6a1c9b6474f43a5b474072", "sha256": "92056da2dd053212c054a272abfa84c245a1ba313446fddc10f19d8c6ada5cb3" }, "downloads": -1, "filename": "snps-0.3.0.tar.gz", "has_sig": false, "md5_digest": "9e43ffcd5c6a1c9b6474f43a5b474072", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 98125, "upload_time": "2019-07-29T01:02:13", "url": "https://files.pythonhosted.org/packages/9a/a5/fb0df33495d328550cb6815286fed5d146cdc8ad33be877d0a881f32a614/snps-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "095548bd9d8db92493fd56330b45dff1", "sha256": "79d35fd6ee5ed02e45914d820a3eb535be6908d924a12cdb958c24cc0d7813c8" }, "downloads": -1, "filename": "snps-0.4.0-py3-none-any.whl", "has_sig": false, "md5_digest": "095548bd9d8db92493fd56330b45dff1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 36829, "upload_time": "2019-10-06T18:14:02", "url": "https://files.pythonhosted.org/packages/22/9f/020f9f56bc168450d12955f33c543541f7c509503f59186a9bf2450ee1e0/snps-0.4.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "91fdfec99dbb72b8daf5b3ded15d61b5", "sha256": "a4ce0a5f3ad4bf4f120debf60b223513951b453cd683bc87bc73458335586488" }, "downloads": -1, "filename": "snps-0.4.0.tar.gz", "has_sig": false, "md5_digest": "91fdfec99dbb72b8daf5b3ded15d61b5", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 99678, "upload_time": "2019-10-06T18:14:04", "url": "https://files.pythonhosted.org/packages/d0/3e/13c801961479d8705a2247084388d855db69491bff3136537c9573a457f4/snps-0.4.0.tar.gz" } ], "0.5.0": [ { "comment_text": "", "digests": { "md5": "6093f3e79ade4a07bee5eb4935363c3c", "sha256": "c424455ee72a8ba7802d697f0496ef11f5d6d265d1218eda6f9646d4076d4ff9" }, "downloads": -1, "filename": "snps-0.5.0-py3-none-any.whl", "has_sig": false, "md5_digest": "6093f3e79ade4a07bee5eb4935363c3c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 36983, "upload_time": "2019-10-17T07:05:34", "url": "https://files.pythonhosted.org/packages/a9/34/adc8932b3055527a3f30f716185f86127dce62cdbb431427ef8d0e9ceb67/snps-0.5.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "18b86edbd8c379f23bae7c8507e71b1d", "sha256": "e99ab7a59d7812b0305985c9529229b9aadce18b487ff1a614d255e04abf4f92" }, "downloads": -1, "filename": "snps-0.5.0.tar.gz", "has_sig": false, "md5_digest": "18b86edbd8c379f23bae7c8507e71b1d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 100214, "upload_time": "2019-10-17T07:05:35", "url": "https://files.pythonhosted.org/packages/b8/77/1c9d44f31a9418007d9678211ea1368ef82e7abefed8c0c52d260bb85a6c/snps-0.5.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6093f3e79ade4a07bee5eb4935363c3c", "sha256": "c424455ee72a8ba7802d697f0496ef11f5d6d265d1218eda6f9646d4076d4ff9" }, "downloads": -1, "filename": "snps-0.5.0-py3-none-any.whl", "has_sig": false, "md5_digest": "6093f3e79ade4a07bee5eb4935363c3c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 36983, "upload_time": "2019-10-17T07:05:34", "url": "https://files.pythonhosted.org/packages/a9/34/adc8932b3055527a3f30f716185f86127dce62cdbb431427ef8d0e9ceb67/snps-0.5.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "18b86edbd8c379f23bae7c8507e71b1d", "sha256": "e99ab7a59d7812b0305985c9529229b9aadce18b487ff1a614d255e04abf4f92" }, "downloads": -1, "filename": "snps-0.5.0.tar.gz", "has_sig": false, "md5_digest": "18b86edbd8c379f23bae7c8507e71b1d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 100214, "upload_time": "2019-10-17T07:05:35", "url": "https://files.pythonhosted.org/packages/b8/77/1c9d44f31a9418007d9678211ea1368ef82e7abefed8c0c52d260bb85a6c/snps-0.5.0.tar.gz" } ] }