{ "info": { "author": "23andMe Engineering", "author_email": "mstrand@23andme.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4" ], "description": "SeqSeek [![Build Status](https://travis-ci.org/23andMe/seqseek.svg?branch=master)](https://travis-ci.org/23andMe/seqseek)\n=================\nEasy access to human reference genome sequences.\n\nThis package calls open(file).seek(range) on FASTA files of ASCII characters to provide\nranges of sequence strings. It is exactly as fast as your disk, for better or worse. \n\nRequirements\n------------\n* Python 2.7+ or Python 3.4+\n\nInstall\n-------\n### pip\n```bash\n$ pip install seqseek\n```\n\n### Download Utilities\n```bash\n$ download_build_37 \n$ download_build_38 \n```\nThese commands check to see which chromosomes need to be downloaded, fetch any missing \nfiles, remove newline characters, and run build-specific integrity tests. \nThe sequence files are downloaded from our Amazon S3 bucket which contains\nFASTA-formatted sequence files obtained from NCBI's nucleotide database \n(e.g. [NC_000001.11](https://www.ncbi.nlm.nih.gov/nuccore/NC_000001.11)).\n\n\n### Test Utilities\n```bash\n$ test_build_37\n$ test_build_38\n```\nThese commands run build specific tests to ensure the chromosome files have been\ndownloaded correctly. 
These tests read sequences from each chromosome file and\ncompare the extracted sequence with sequences pulled from https://genome.ucsc.edu.\n\n\n### Using the seqseek package\n```python\nfrom seqseek import Chromosome\n```\nImport the `Chromosome` class from the seqseek package.\n\n```python\nChromosome(17).sequence(141224, 141244) #=> TTTCCTGAGAGTTCCAGTGA\n```\nThe command above returns a string of 20 nucleotides found between interbase \npositions 141224-141244 on chromosome 17. SeqSeek currently defaults to build\n37 to match the coordinates used by the 23andMe website and raw data downloads. \n\n---\n\n```python\nfrom seqseek import Chromosome, BUILD37, BUILD38\nChromosome(17, assembly=BUILD38).sequence(141224, 141244) #=> ACCTGGTGAGGGGACATGGG\n```\nYou can explicitly specify either build 37 or build 38 using the `BUILD37` and `BUILD38` \nconstants and the `assembly` keyword argument. \n\n---\n\n```python\nChromosome('NC_000017.11').sequence(141224, 141244) #=> ACCTGGTGAGGGGACATGGG\n```\nYou can also load a chromosome directly by its accession name instead of specifying both \nthe common name and the genome assembly. \n\n\n### The Mitochondria \nThe mitochondrial genome is a circular piece of DNA, and it is sometimes useful to\nretrieve sequences that extend beyond the min or max coordinates of the contig\nand loop back to the beginning or end. This is mainly useful for pulling\nflanking sequences for designing oligonucleotide probes near the extreme 3' and\n5' regions of the mitochondria, but there may be other applications as well.\n\nWe never return sequences that are longer than the length of the contig.\nAttempts to load such a sequence raise a `TooManyLoops` exception.\n\nLooping can be requested by passing `loop=True` when loading the\nmitochondria by name. 
These two invocations return the same sequence: \n\n```python\nChromosome('MT', loop=True).sequence(-5, 5) # negative start coordinate \nChromosome('MT', loop=True).sequence(16564, 16574) # out of bounds end coordinate\n```\n\nSeqSeek uses the revised Cambridge Reference Sequence (rCRS) for the mitochondria on \nboth build 37 and 38. If you need access to the out-of-date RSRS sequence for\nbackward compatibility, you may load it directly by accession (`NC_001807.4`). \n\nThe rCRS mitochondrial sequence contains an `N` base at position 3106-3107 to\npreserve legacy nucleotide numbering. This is useful when working with legacy\ncoordinates but is impractical when working with sequences that are\nexpected to align to observed human mitochondrial sequences. SeqSeek\nremoves this `N` unless it is explicitly requested by passing `RCRS_N_remove=False`.\n\n```python\nChromosome('MT').sequence(3106, 3107) # => ''\nChromosome('MT').sequence(3106, 3108) # => 'T'\n```\n\n\n### Supported chromosome names and accessions \nSeqSeek uses the following common chromosome names: \n`1`, `2`, ..., `22`, `X`, `Y`, and `MT`. 
\n\nThe full list of supported accessions is as follows:\n* NC_000001.10\n* NC_000001.11\n* NC_000002.11\n* NC_000002.12\n* NC_000003.11\n* NC_000003.12\n* NC_000004.11\n* NC_000004.12\n* NC_000005.9\n* NC_000005.10\n* NC_000006.11\n* NC_000006.12\n* NC_000007.13\n* NC_000007.14\n* NC_000008.10\n* NC_000008.11\n* NC_000009.11\n* NC_000009.12\n* NC_000010.10\n* NC_000010.11\n* NC_000011.9\n* NC_000011.10\n* NC_000012.11\n* NC_000012.12\n* NC_000013.10\n* NC_000013.11\n* NC_000014.8\n* NC_000014.9\n* NC_000015.9\n* NC_000015.10\n* NC_000016.9\n* NC_000016.10\n* NC_000017.10\n* NC_000017.11\n* NC_000018.9\n* NC_000018.10\n* NC_000019.9\n* NC_000019.10\n* NC_000020.10\n* NC_000020.11\n* NC_000021.8\n* NC_000021.9\n* NC_000022.10\n* NC_000022.11\n* NC_000023.10\n* NC_000023.11\n* NC_000024.10\n* NC_000024.9\n* NC_001807.4\n* NC_012920.1\n* NT_113891.2\n* NT_167244.1\n* NT_167245.1\n* NT_167246.1\n* NT_167247.1\n* NT_167248.1\n* NT_167249.1\n* NT_167250.1\n* NT_167251.1", "description_content_type": "", "docs_url": null, "download_url": "https://github.com/23andMe/seqseek/tarball/0.4.1", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/23andMe/seqseek", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "seqseek", "package_url": "https://pypi.org/project/seqseek/", "platform": "", "project_url": "https://pypi.org/project/seqseek/", "project_urls": { "Download": "https://github.com/23andMe/seqseek/tarball/0.4.1", "Homepage": "https://github.com/23andMe/seqseek" }, "release_url": "https://pypi.org/project/seqseek/0.4.1/", "requires_dist": null, "requires_python": "", "summary": "Easy access to human reference genome sequences", "version": "0.4.1" }, "last_serial": 5185727, "releases": { "0.1.10": [ { "comment_text": "", "digests": { "md5": "6abeaa1d26da229bae195fcb2821e531", "sha256": "bca4a0652c0dfd8e35edbadf38fd7d3e89db110545a81da8e0c7386f90c0ecde" }, "downloads": -1, "filename": 
"seqseek-0.1.10.tar.gz", "has_sig": false, "md5_digest": "6abeaa1d26da229bae195fcb2821e531", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10308, "upload_time": "2016-04-06T18:46:52", "url": "https://files.pythonhosted.org/packages/33/fb/28cae9230bc4128266794509445ddd5e94bf9b6090a72688de17652e35e8/seqseek-0.1.10.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "0736f370d309efba7f176119ef52ce5c", "sha256": "885e7016b15a424ed38f0fab831b8263431f688ee251fe070b6c07607af11247" }, "downloads": -1, "filename": "seqseek-0.1.4.tar.gz", "has_sig": false, "md5_digest": "0736f370d309efba7f176119ef52ce5c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6313, "upload_time": "2016-01-29T20:37:18", "url": "https://files.pythonhosted.org/packages/92/ed/6eab61ee6edcd2ac5666e8377d57bdbdd1014bca150fd5a4f3cb25e40ad0/seqseek-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "76db827549ac4040b125bf578e53ae10", "sha256": "849553d44d4dfb89ee30df0be18f4c26708593f534e06ae285d215eda5968c77" }, "downloads": -1, "filename": "seqseek-0.1.5.tar.gz", "has_sig": false, "md5_digest": "76db827549ac4040b125bf578e53ae10", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6336, "upload_time": "2016-01-29T22:57:51", "url": "https://files.pythonhosted.org/packages/42/c3/d20e2a443b8e233bd7aa5070d3fcdb6423313a70c3d89a93f0b12d1f9ce4/seqseek-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "17028b4ad1849de3240ed145ff6b5686", "sha256": "110f60eadfb3db6f6c94cb6570a34beb073fbd010512199b844a9b003167f198" }, "downloads": -1, "filename": "seqseek-0.1.6.tar.gz", "has_sig": false, "md5_digest": "17028b4ad1849de3240ed145ff6b5686", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6320, "upload_time": "2016-01-29T23:26:25", "url": 
"https://files.pythonhosted.org/packages/2e/27/722e553c3ded44dde5d20267d4adc9f2baa201245d492d632de3e2a3071c/seqseek-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "291984907dcea009b54ebe7076a8440d", "sha256": "a49be185f640ed62c54ca5692296f6377402287984974161fa42dbe0f5bc794c" }, "downloads": -1, "filename": "seqseek-0.1.7.tar.gz", "has_sig": false, "md5_digest": "291984907dcea009b54ebe7076a8440d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10292, "upload_time": "2016-02-19T22:41:41", "url": "https://files.pythonhosted.org/packages/c8/2f/83b8927041e1632d0a449611cae452feaa5768682e04939920ad72675d53/seqseek-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "df415e62755bf2645ec0872644aca30d", "sha256": "82dedc5e0db812feaa4154b171ba01886756a97e1456b96d0cab9314612ee0fc" }, "downloads": -1, "filename": "seqseek-0.1.8.tar.gz", "has_sig": false, "md5_digest": "df415e62755bf2645ec0872644aca30d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10281, "upload_time": "2016-04-06T18:34:46", "url": "https://files.pythonhosted.org/packages/6d/a4/4ac5649dc6c94e83c9e6b1d3d36792ad32a6f2c88a84fe9439be2b784f7b/seqseek-0.1.8.tar.gz" } ], "0.1.9": [ { "comment_text": "", "digests": { "md5": "42451ba648ae1dc0ad30c484c9704bdb", "sha256": "ae262426390a28038db36b90a75cdcd79bdde19649e7270dea5cba5d1e161815" }, "downloads": -1, "filename": "seqseek-0.1.9.tar.gz", "has_sig": false, "md5_digest": "42451ba648ae1dc0ad30c484c9704bdb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10302, "upload_time": "2016-04-06T18:40:23", "url": "https://files.pythonhosted.org/packages/58/b6/30d5f0a761745d5da3f47a9ef8160830b8f6264a428c27ee7ee9b7666040/seqseek-0.1.9.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "45db3b7b619adbc0cb812f4fe70bff34", "sha256": "3027862829b0b63120a1cbe12e01ee2338433db74a4a7ca472f4a0068cd6e7d5" }, 
"downloads": -1, "filename": "seqseek-0.2.0.tar.gz", "has_sig": false, "md5_digest": "45db3b7b619adbc0cb812f4fe70bff34", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11108, "upload_time": "2016-10-11T07:09:54", "url": "https://files.pythonhosted.org/packages/3d/4f/8112ed3dfe4cb05ee9c8b15b7424ecd1b3c25147ab64beb34ce5783e89cd/seqseek-0.2.0.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "2939dcf717d7d2c2998c4816cb2577be", "sha256": "f82601db4df8d152af3fab85b0beca3d3473973af013a5dad652367247d3bcce" }, "downloads": -1, "filename": "seqseek-0.2.3.tar.gz", "has_sig": false, "md5_digest": "2939dcf717d7d2c2998c4816cb2577be", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12849, "upload_time": "2016-10-13T01:15:45", "url": "https://files.pythonhosted.org/packages/46/30/c046dbd0ed41dd540784f2955c2310ed963e90a3cdd1c390e7ca973fc113/seqseek-0.2.3.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "eb4af2f347c950cd2b9ca9986107f304", "sha256": "19f221404fdd40233db720ffed9ce6d08469c9da6651febdb365f84eaa2ce899" }, "downloads": -1, "filename": "seqseek-0.3.0.tar.gz", "has_sig": false, "md5_digest": "eb4af2f347c950cd2b9ca9986107f304", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14929, "upload_time": "2016-10-28T22:47:00", "url": "https://files.pythonhosted.org/packages/8d/63/246794da6c31894c0899043b588024367ceab1d4155e83f1684b74359e84/seqseek-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "f50c2ebc34e29f02f1932d17895273f8", "sha256": "73df9dfa3bf4c201b7610a381dc808ae0c8bc3f82cbb3461dfcc19ca643b9c0d" }, "downloads": -1, "filename": "seqseek-0.3.1.tar.gz", "has_sig": false, "md5_digest": "f50c2ebc34e29f02f1932d17895273f8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14855, "upload_time": "2017-02-07T21:30:06", "url": 
"https://files.pythonhosted.org/packages/b3/2e/a8046e61c2f24c58173bd72ddfc2f85f7401209bc0a01b5ac89dbde5fd60/seqseek-0.3.1.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "8bf587f5bf9ccfe14a387d6355a8025e", "sha256": "eda6af052b6153a0434d02af616cc458701ef030486c5eb6ecbcee0e32565a01" }, "downloads": -1, "filename": "seqseek-0.3.2.tar.gz", "has_sig": false, "md5_digest": "8bf587f5bf9ccfe14a387d6355a8025e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14830, "upload_time": "2017-03-03T03:16:26", "url": "https://files.pythonhosted.org/packages/ff/9a/17ca3713da9a767b25c424f4f82345938804f00f96ac1f841402e334858b/seqseek-0.3.2.tar.gz" } ], "0.3.3": [ { "comment_text": "", "digests": { "md5": "15e0dea86df81ef137d51eaf9e77f597", "sha256": "26aaae680cc4d13aa72909412f9c2f89d81687923c6af0b272bcdcbf689f08ab" }, "downloads": -1, "filename": "seqseek-0.3.3.tar.gz", "has_sig": false, "md5_digest": "15e0dea86df81ef137d51eaf9e77f597", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14831, "upload_time": "2017-03-03T03:31:10", "url": "https://files.pythonhosted.org/packages/71/a1/a3ff8d5510d73c814c0e650d23d3917a69caacd653205b5e7551eff75e29/seqseek-0.3.3.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "99f05425909a237b83d0a21604848234", "sha256": "f09c4d4a304720af8d20e5f4492c60969dbf47c7f13e2fe99ab4dcc55d09daf8" }, "downloads": -1, "filename": "seqseek-0.4.0.tar.gz", "has_sig": false, "md5_digest": "99f05425909a237b83d0a21604848234", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14836, "upload_time": "2018-03-07T09:35:59", "url": "https://files.pythonhosted.org/packages/8f/49/dd7b1bcb49a761543a2adfabc9506834e3179f0d6f28a79dae111447b560/seqseek-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "b646bde5a9caee57a767603c14040da9", "sha256": "65ca8999b6fda8a9269499d20943c4c70854fe32d587fec8c6fc75dbfdeaa4a3" }, 
"downloads": -1, "filename": "seqseek-0.4.1.tar.gz", "has_sig": false, "md5_digest": "b646bde5a9caee57a767603c14040da9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15014, "upload_time": "2019-04-25T02:22:33", "url": "https://files.pythonhosted.org/packages/d3/d2/eddc6a23224970b2346f2de0790ec8a3537871cd821380a512726634cc07/seqseek-0.4.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b646bde5a9caee57a767603c14040da9", "sha256": "65ca8999b6fda8a9269499d20943c4c70854fe32d587fec8c6fc75dbfdeaa4a3" }, "downloads": -1, "filename": "seqseek-0.4.1.tar.gz", "has_sig": false, "md5_digest": "b646bde5a9caee57a767603c14040da9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15014, "upload_time": "2019-04-25T02:22:33", "url": "https://files.pythonhosted.org/packages/d3/d2/eddc6a23224970b2346f2de0790ec8a3537871cd821380a512726634cc07/seqseek-0.4.1.tar.gz" } ] }