{ "info": { "author": "Steve Moss", "author_email": "gawbul@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Topic :: Internet", "Topic :: Scientific/Engineering :: Bio-Informatics", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "=============\npyEnsemblRest\n=============\n\n``pyEnsemblRest`` is a simple Python wrapper around the EnsEMBL REST API\n\n.. image:: https://travis-ci.org/pyOpenSci/pyEnsemblRest.svg?branch=master\n :target: https://travis-ci.org/pyOpenSci/pyEnsemblRest\n :alt: Build Status\n\n.. image:: https://coveralls.io/repos/github/pyOpenSci/pyEnsemblRest/badge.svg?branch=master\n :target: https://coveralls.io/github/pyOpenSci/pyEnsemblRest?branch=master\n :alt: Code Coverage\n\n.. image:: https://img.shields.io/scrutinizer/g/pyOpenSci/pyEnsemblRest.svg?maxAge=2592000\n :target: https://img.shields.io/scrutinizer/g/pyOpenSci/pyEnsemblRest.svg?maxAge=2592000\n :alt: Code Quality\n\n.. image:: https://img.shields.io/gitter/room/pyOpenSci/pyEnsemblRest.js.svg?maxAge=2592000\n :target: https://gitter.im/pyOpenSci/pyEnsemblRest\n :alt: Gitter Chat\n\n.. image:: https://img.shields.io/pypi/v/pyensemblrest.svg?maxAge=2592000\n :target: https://pypi.python.org/pypi/pyensemblrest\n :alt: PyPi Package\n\n.. image:: https://img.shields.io/github/downloads/pyOpenSci/pyEnsemblRest/total.svg?maxAge=2592000\n :target: https://github.com/pyOpenSci/pyEnsemblRest\n :alt: GitHub Downloads\n\n.. 
image:: https://img.shields.io/pypi/dd/pyensemblrest.svg?maxAge=2592000\n :target: https://img.shields.io/pypi/dd/pyensemblrest.svg?maxAge=2592000\n :alt: PyPi Downloads\n \nLicense\n=======\n\npyEnsemblRest - A wrapper for the EnsEMBL REST API\n\nCopyright (C) 2013-2016, Steve Moss\n\npyEnsemblRest is free software: you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation, either version 3 of the License, or\n(at your option) any later version.\n\npyEnsemblRest is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\nGNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License\nalong with pyEnsemblRest. If not, see .\n\n\nInstallation\n============\n\nUsing pip\n---------\n\nSimply type:\n\n.. code:: bash\n\n pip install pyensemblrest\n\n\nFrom source\n-----------\n\nClone the pyEnsemblRest repository, then install the package from source:\n\n.. code:: bash\n\n git clone https://github.com/pyOpenSci/pyEnsemblRest.git\n cd pyEnsemblRest\n sudo python setup.py install\n\nUsage\n=====\n\nTo import and set up a new EnsemblRest object, do the following:\n\n.. code:: python\n\n from ensemblrest import EnsemblRest\n ensRest = EnsemblRest()\n\nAn EnsemblRest() instance points to http://rest.ensembl.org/ . To use the Ensembl Genomes REST server instead, you can import a different object:\n\n.. code:: python\n\n from ensemblrest import EnsemblGenomeRest\n ensGenomeRest = EnsemblGenomeRest()\n\nOr, as an alternative, you can pass a different base url when instantiating the EnsemblRest class:\n\n.. code:: python\n\n from ensemblrest import EnsemblRest\n ensGenomeRest = EnsemblRest(base_url='http://rest.ensemblgenomes.org')\n\nTo use a custom EnsEMBL REST server, set up the EnsemblRest object in the same way:\n\n.. 
code:: python\n\n from ensemblrest import EnsemblRest\n # set up the rest object to point to a localhost server. 3000 is the default REST port\n ensRest = EnsemblRest(base_url='http://localhost:3000')\n\nYou may also provide proxy server settings in the form of a dict, as follows:\n\n.. code:: python\n\n from ensemblrest import EnsemblRest\n # set up the rest object to point to a proxy server\n ensRest = EnsemblRest(proxies={'http':'proxy.address.com:3128', 'https':'proxy.address.com:3128'})\n\nEnsEMBL applies a rate-limit policy to requests: you can make up to 15 requests per second. You could wait a little between your requests:\n\n.. code:: python\n\n from time import sleep\n # sleep for a second so we don't get rate-limited\n sleep(1)\n\nAlternatively, this library checks and limits your requests to 15 requests per second. Avoid running several python processes at once to get your data, otherwise you will be blacklisted by the ensembl team. If you have to make a lot of requests, consider using endpoints that support POST, or contact the ensembl team to ask for POST support on the endpoints you are interested in.\n\nGET endpoints\n-------------\n\nThe methods of the EnsemblRest and EnsemblGenomeRest classes are not defined in the library source, so you cannot read their docstrings with help() in a python or ipython terminal. However, once a class is instantiated you can see all the methods available for the ensembl_ and ensemblgenomes_ rest servers. To get help on a particular method, please refer to the documentation of the corresponding endpoint in the ensembl_ and ensemblgenomes_ rest services. Please note that endpoints on ensembl_ may be different from ensemblgenomes_ endpoints.\nIf you look, for example, at the sequence_ endpoint documentation, you will find optional and required parameters. Required parameters must be specified for the request to work, otherwise you will get an exception. Optional parameters may be specified or not, depending on your request. In all cases, parameter names are the same as those used in the documentation. 
For example, to get data using the sequence_ endpoint, you must specify at least the required parameters:\n\n.. code:: python\n\n seq = ensRest.getSequenceById(id='ENSG00000157764')\n\nTo mask the sequence and expand the 5' UTR, you may set optional parameters using the same names described in the documentation:\n\n.. code:: python\n\n seq = ensRest.getSequenceById(id='ENSG00000157764', mask=\"soft\", expand_5prime=1000)\n\nMultiple values for a single parameter (for GET methods) can be submitted in a list. For example, to get the same results as\n\n.. code:: bash\n\n curl 'http://rest.ensembl.org/overlap/region/human/7:140424943-140624564?feature=gene;feature=transcript;feature=cds;feature=exon' -H 'Content-type:application/json'\n\nas described in the `overlap region`_ GET endpoint, you can use the following function:\n\n.. code:: python\n\n data = ensRest.getOverlapByRegion(species=\"human\", region=\"7:140424943-140624564\", feature=[\"gene\", \"transcript\", \"cds\", \"exon\"])\n\n.. _overlap region: http://rest.ensembl.org/documentation/info/overlap_region\n\nPOST endpoints\n--------------\n\nPOST endpoints are used in the same way as GET endpoints. The only difference is that they accept parameters as a python list, in order to perform multiple queries on the same ensembl endpoint. The parameter names are the same as those used in the documentation. For example, we can use the `POST sequence`_ endpoint in this way:\n\n.. code:: python\n\n seqs = ensRest.getSequenceByMultipleIds(ids=[\"ENSG00000157764\", \"ENSG00000248378\" ])\n\nwhere the example value ``{ \"ids\" : [\"ENSG00000157764\", \"ENSG00000248378\" ] }`` is converted into the keyword argument ``ids=[\"ENSG00000157764\", \"ENSG00000248378\" ]``. As in the previous example, we can add optional parameters:\n\n.. 
code:: python\n\n seqs = ensRest.getSequenceByMultipleIds(ids=[\"ENSG00000157764\", \"ENSG00000248378\"], mask=\"soft\")\n\nChange the default Output format\n--------------------------------\n\nYou can change the default output format by passing a supported ``Content-type`` using\nthe ``content_type`` parameter, for example:\n\n.. code:: python\n\n plain_xml = ensRest.getArchiveById(id='ENSG00000157764', content_type=\"text/xml\")\n\nFor a complete list of supported ``Content-type`` values, see `Supported MIME Types`_ in\nthe ensembl REST documentation. You also need to check that the same ``Content-type``\nis listed as supported in the EnsEMBL endpoint description.\n\n.. _Supported MIME Types: https://github.com/Ensembl/ensembl-rest/wiki/Output-formats#supported-mime-types\n\nRate limiting\n-------------\n\nSometimes you can be rate limited if you are querying EnsEMBL REST services with more than one concurrent process, or by `sharing ip addresses`_. In such a case, you may get a message like this:\n\n.. _sharing ip addresses: https://github.com/Ensembl/ensembl-rest/wiki#example-clients\n\n.. code:: bash\n\n ensemblrest.exceptions.EnsemblRestRateLimitError: EnsEMBL REST API returned a 429 (Too Many Requests): You have been rate-limited; wait and retry. The headers X-RateLimit-Reset, X-RateLimit-Limit and X-RateLimit-Remaining will inform you of how long you have until your limit is reset and what that limit was. If you get this response and have not exceeded your limit then check if you have made too many requests per second. (Rate limit hit: Retry after 2 seconds)\n\nAlthough this library limits itself to 15 requests per second, you should avoid running multiple\nEnsEMBL REST clients. To deal with this problem without interrupting your code, try\nto handle the exception. For example:\n\n.. 
code:: python\n\n # import required modules\n import os\n import sys\n import time\n import logging\n\n # get ensembl REST modules and exception\n from ensemblrest import EnsemblRest\n from ensemblrest import EnsemblRestRateLimitError\n\n # A useful way to define a logger level, handler, and formatter\n logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.INFO)\n logger = logging.getLogger(os.path.basename(sys.argv[0]))\n\n # set up a new EnsemblRest object\n ensRest = EnsemblRest()\n\n # Make a request and deal with retry_after. Set a maximum number of retries (don't\n # try to do the same request forever or you will be banned from ensembl!)\n attempt = 0\n max_attempts = 3\n\n while attempt < max_attempts:\n     # update attempt count\n     attempt += 1\n\n     try:\n         result = ensRest.getLookupById(id='ENSG00000157764')\n         # exit while on success\n         break\n\n     # log exception and sleep a certain amount of time (sleeping time increases at each step)\n     except EnsemblRestRateLimitError as message:\n         logger.warning(message)\n         time.sleep(ensRest.retry_after*attempt)\n\n else:\n     # reached only when every attempt failed (the loop ended without break)\n     raise Exception(\"max attempts exceeded (%s)\" %(max_attempts))\n\n sys.stdout.write(\"%s\\n\" %(result))\n sys.stdout.flush()\n\nMethods list\n------------\n\nHere is a list of all the methods defined. Methods called on the ``ensRest`` instance are specific to the ensembl_ rest server, while methods called on the ``ensGenomeRest`` instance are specific to the ensemblgenomes_ rest server.\n\nTo access the *Archive* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getArchiveById(id=\"ENSG00000157764\")\n print ensRest.getArchiveByMultipleIds(id=[\"ENSG00000157764\", \"ENSG00000248378\"])\n\nTo access the *Comparative Genomics* endpoints you can use the following methods:\n\n.. 
code:: python\n\n print ensGenomeRest.getGeneFamilyById(id=\"MF_01687\", compara=\"bacteria\")\n print ensGenomeRest.getGeneFamilyMemberById(id=\"b0344\", compara=\"bacteria\")\n print ensGenomeRest.getGeneFamilyMemberBySymbol(symbol=\"lacZ\", species=\"escherichia_coli_str_k_12_substr_mg1655\", compara=\"bacteria\")\n # Change the returned content type to \"Newick\" format\n print ensRest.getGeneTreeById(id='ENSGT00390000003602', nh_format=\"simple\", content_type=\"text/x-nh\")\n print ensRest.getGeneTreeMemberById(id='ENSG00000157764')\n print ensRest.getGeneTreeMemberBySymbol(species='human', symbol='BRCA2')\n print ensRest.getAlignmentByRegion(species=\"taeniopygia_guttata\", region=\"2:106040000-106040050:1\", species_set_group=\"sauropsids\")\n print ensRest.getHomologyById(id='ENSG00000157764')\n print ensRest.getHomologyBySymbol(species='human', symbol='BRCA2')\n\nTo access the *Cross References* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getXrefsById(id='ENSG00000157764')\n print ensRest.getXrefsByName(species='human', name='BRCA2')\n print ensRest.getXrefsBySymbol(species='human', symbol='BRCA2')\n\nTo access the *Information* endpoints you can use the following methods:\n\n.. 
code:: python\n\n print ensRest.getInfoAnalysis(species=\"homo_sapiens\")\n print ensRest.getInfoAssembly(species=\"homo_sapiens\", bands=1) #bands is an optional parameter\n print ensRest.getInfoAssemblyRegion(species=\"homo_sapiens\", region_name=\"X\")\n print ensRest.getInfoBiotypes(species=\"homo_sapiens\")\n print ensRest.getInfoComparaMethods()\n print ensRest.getInfoComparaSpeciesSets(methods=\"EPO\")\n print ensRest.getInfoComparas()\n print ensRest.getInfoData()\n print ensGenomeRest.getInfoEgVersion()\n print ensRest.getInfoExternalDbs(species=\"homo_sapiens\")\n print ensGenomeRest.getInfoDivisions()\n print ensGenomeRest.getInfoGenomesByName(name=\"campylobacter_jejuni_subsp_jejuni_bh_01_0142\")\n\n #This response is very heavy\n #print ensGenomeRest.getInfoGenomes()\n\n print ensGenomeRest.getInfoGenomesByAccession(division=\"U00096\")\n print ensGenomeRest.getInfoGenomesByAssembly(division=\"GCA_000005845\")\n print ensGenomeRest.getInfoGenomesByDivision(division=\"EnsemblPlants\")\n print ensGenomeRest.getInfoGenomesByTaxonomy(division=\"Arabidopsis\")\n print ensRest.getInfoPing()\n print ensRest.getInfoRest()\n print ensRest.getInfoSoftware()\n print ensRest.getInfoSpecies(division=\"ensembl\")\n print ensRest.getInfoVariation(species=\"homo_sapiens\")\n # Restrict populations returned to e.g. only populations with LD data. It is highly recommended\n # to set a filter and to avoid loading the complete list of populations.\n print ensRest.getInfoVariationPopulations(species=\"homo_sapiens\", filter=\"LD\")\n\nTo access the *Linkage Disequilibrium* endpoints you can use the following methods:\n\n.. 
code:: python\n\n print ensRest.getLdId(species=\"human\", id=\"rs1042779\", population_name=\"1000GENOMES:phase_3:KHV\", window_size=500, d_prime=1.0)\n print ensRest.getLdPairwise(species=\"human\", id1=\"rs6792369\", id2=\"rs1042779\")\n print ensRest.getLdRegion(species=\"human\", region=\"6:25837556..25843455\", population_name=\"1000GENOMES:phase_3:KHV\")\n\nTo access the *Lookup* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getLookupById(id='ENSG00000157764')\n print ensRest.getLookupByMultipleIds(ids=[\"ENSG00000157764\", \"ENSG00000248378\" ])\n print ensRest.getLookupBySymbol(species=\"homo_sapiens\", symbol=\"BRCA2\", expand=1)\n print ensRest.getLookupByMultipleSymbols(species=\"homo_sapiens\", symbols=[\"BRCA2\", \"BRAF\"])\n\nTo access the *Mapping* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getMapCdnaToRegion(id='ENST00000288602', region='100..300')\n print ensRest.getMapCdsToRegion(id='ENST00000288602', region='1..1000')\n print ensRest.getMapAssemblyOneToTwo(species='human', asm_one='NCBI36', region='X:1000000..1000100:1', asm_two='GRCh37')\n print ensRest.getMapTranslationToRegion(id='ENSP00000288602', region='100..300')\n\nTo access the *Ontologies and Taxonomy* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getAncestorsById(id='GO:0005667')\n print ensRest.getAncestorsChartById(id='GO:0005667')\n print ensRest.getDescendantsById(id='GO:0005667')\n print ensRest.getOntologyById(id='GO:0005667')\n print ensRest.getOntologyByName(name='transcription factor complex')\n print ensRest.getTaxonomyClassificationById(id='9606')\n print ensRest.getTaxonomyById(id='9606')\n print ensRest.getTaxonomyByName(name=\"Homo%25\")\n\nTo access the *Overlap* endpoints you can use the following methods:\n\n.. 
code:: python\n\n print ensRest.getOverlapById(id=\"ENSG00000157764\", feature=\"gene\")\n print ensRest.getOverlapByRegion(species=\"human\", region=\"7:140424943-140624564\", feature=\"gene\")\n print ensRest.getOverlapByTranslation(id=\"ENSP00000288602\")\n\nTo access the *Regulation* endpoints you can use the following method:\n\n.. code:: python\n\n print ensRest.getRegulatoryFeatureById(species=\"homo_sapiens\", id=\"ENSR00001348195\")\n\nTo access the *Sequences* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getSequenceById(id='ENSG00000157764')\n print ensRest.getSequenceByMultipleIds(ids=[\"ENSG00000157764\", \"ENSG00000248378\" ])\n print ensRest.getSequenceByRegion(species='human', region='X:1000000..1000100')\n print ensRest.getSequenceByMultipleRegions(species=\"homo_sapiens\", regions=[\"X:1000000..1000100:1\", \"ABBA01004489.1:1..100\"])\n\nTo access the *Transcript Haplotypes* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getTranscripsHaplotypes(species=\"homo_sapiens\", id=\"ENST00000288602\")\n\nTo access the *VEP* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.getVariantConsequencesByHGVSnotation(species=\"human\", hgvs_notation=\"AGT:c.803T>C\")\n print ensRest.getVariantConsequencesById(species='human', id='COSM476')\n print ensRest.getVariantConsequencesByMultipleIds(species=\"human\", ids=[ \"rs116035550\", \"COSM476\" ])\n print ensRest.getVariantConsequencesByRegion(species='human', region='9:22125503-22125502:1', allele='C')\n print ensRest.getVariantConsequencesByMultipleRegions(species=\"human\", variants=[\"21 26960070 rs116645811 G A . . .\", \"21 26965148 rs1135638 G A . . .\" ] )\n\nTo access the *Variation* endpoints you can use the following methods:\n\n.. 
code:: python\n\n print ensRest.getVariationById(id=\"rs56116432\", species=\"homo_sapiens\")\n print ensRest.getVariationByMultipleIds(ids=[\"rs56116432\", \"COSM476\" ], species=\"homo_sapiens\")\n\nTo access the *Variation GA4GH* endpoints you can use the following methods:\n\n.. code:: python\n\n print ensRest.searchGA4GHCallSet(variantSetId=1, pageSize=2)\n print ensRest.getGA4GHCallSetById(id=\"1:NA19777\")\n print ensRest.searchGA4GHDataset(pageSize=3)\n print ensRest.getGA4GHDatasetById(id=\"6e340c4d1e333c7a676b1710d2e3953c\")\n print ensRest.getGA4GHVariantsById(id=\"1:rs1333049\")\n print ensRest.searchGA4GHVariants(variantSetId=1, referenceName=22, start=17190024, end=17671934, pageToken=\"\", pageSize=1)\n print ensRest.searchGA4GHVariantsets(datasetId=\"6e340c4d1e333c7a676b1710d2e3953c\", pageToken=\"\", pageSize=2)\n print ensRest.getGA4GHVariantsetsById(id=1)\n print ensRest.searchGA4GHReferences(referenceSetId=\"GRCh38\", pageSize=10)\n print ensRest.getGA4GHReferencesById(id=\"9489ae7581e14efcad134f02afafe26c\")\n print ensRest.searchGA4GHReferenceSets()\n print ensRest.getGA4GHReferenceSetsById(id=\"GRCh38\")\n\n\n.. _ensembl: http://rest.ensembl.org/\n.. _ensemblgenomes: http://rest.ensemblgenomes.org/\n.. _sequence: http://rest.ensembl.org/documentation/info/sequence_id\n.. 
_POST sequence: http://rest.ensembl.org/documentation/info/sequence_id_post\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/pyopensci/pyensemblrest/tree/master", "keywords": "ensembl python rest api", "license": "GPLv3", "maintainer": "", "maintainer_email": "", "name": "pyensemblrest", "package_url": "https://pypi.org/project/pyensemblrest/", "platform": "", "project_url": "https://pypi.org/project/pyensemblrest/", "project_urls": { "Homepage": "http://github.com/pyopensci/pyensemblrest/tree/master" }, "release_url": "https://pypi.org/project/pyensemblrest/0.2.3/", "requires_dist": null, "requires_python": "", "summary": "An easy way to access EnsEMBL data with Python.", "version": "0.2.3" }, "last_serial": 3601923, "releases": { "0.1.7b": [ { "comment_text": "", "digests": { "md5": "2f6608cce21240d7993208948afbdcad", "sha256": "4c2b5955f11db5386c59a9f464763d06cf4557ccc89064fb02d31a6cec8a3f58" }, "downloads": -1, "filename": "pyensemblrest-0.1.7b.tar.gz", "has_sig": false, "md5_digest": "2f6608cce21240d7993208948afbdcad", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6185, "upload_time": "2013-05-13T18:27:17", "url": "https://files.pythonhosted.org/packages/31/aa/5cef949c37cd7551beebbc42002609a27a9dc447b425ff59c2f871e76a4b/pyensemblrest-0.1.7b.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "18131756d783ed25f7a38a6858edfe72", "sha256": "a47198216abcf27861b7a038f7a73b4b328ba0e3ddc6e4d56fbc4ef033d4beba" }, "downloads": -1, "filename": "pyensemblrest-0.2.2.tar.gz", "has_sig": false, "md5_digest": "18131756d783ed25f7a38a6858edfe72", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 22871, "upload_time": "2016-06-13T14:12:32", "url": 
"https://files.pythonhosted.org/packages/ad/cc/39c6641469e914e083393662cb0b5f136d5f34eb4f1a0834f0107be8cf5b/pyensemblrest-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "61c57cf690eb4dd2bf54572a3d35841f", "sha256": "a547bb7c6c45ac0466f82e18831652c378ba16a1a0efc85781ff14c1b058be9e" }, "downloads": -1, "filename": "pyensemblrest-0.2.3.tar.gz", "has_sig": false, "md5_digest": "61c57cf690eb4dd2bf54572a3d35841f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31565, "upload_time": "2018-02-21T12:03:50", "url": "https://files.pythonhosted.org/packages/f2/91/714fbcaf508fab71ae7da647d2713e515cfc47dbd4d2de442ffe9b3ec1c4/pyensemblrest-0.2.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "61c57cf690eb4dd2bf54572a3d35841f", "sha256": "a547bb7c6c45ac0466f82e18831652c378ba16a1a0efc85781ff14c1b058be9e" }, "downloads": -1, "filename": "pyensemblrest-0.2.3.tar.gz", "has_sig": false, "md5_digest": "61c57cf690eb4dd2bf54572a3d35841f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31565, "upload_time": "2018-02-21T12:03:50", "url": "https://files.pythonhosted.org/packages/f2/91/714fbcaf508fab71ae7da647d2713e515cfc47dbd4d2de442ffe9b3ec1c4/pyensemblrest-0.2.3.tar.gz" } ] }