{
"info": {
"author": "Konstantin Tretyakov",
"author_email": "kt@ut.ee",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 3 - Alpha",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python",
"Topic :: Scientific/Engineering :: Bio-Informatics"
],
"description": "=======================================================================\nPyENCODE: Convenience package for accessing ENCODE project data at UCSC\n=======================================================================\n\nThis is a convenience package for accessing the raw data of the `ENCODE (Encyclopedia of DNA Elements) project `_.\n\nThe raw ENCODE files are organized in a fairly straightforward structure under `this URL `_. The files are divided into collections (\"composites\"), each in its own subdirectory. The subdirectory of each collection keeps the metadata of all files in a text file named ``files.txt``. For example, the `genome segmentation `_ data is kept under ``ROOT_URL/wgEncodeAwgSegmentation``. In particular, the segmentation obtained using ``Combined`` method on the ``K562`` cells is kept in a compressed ``BED``-file named ``ROOT_URL/wgEncodeAwgSegmentation/wgEncodeAwgSegmentationCombinedK562.bed.gz``.\n\nIn principle, downloading and reading the file is rather straightforward. What this package offers in addition is a slightly more streamlined way of listing files, caching them, and reading file metadata. For example, the following code will download the abovementioned file into cache, then open it and index as an interval tree::\n\n >> from pyencode import Encode\n >> e = Encode(cache_dir = 'wgEncode')\n >> gtree = e.AwgSegmentation.CombinedK562.fetch().read_as_intervaltree()\n \nAs another example, here is how to list and pre-download all files in the ``AwgSegmentation`` collection into cache::\n\n >> for f in e.AwgSegmentation:\n >> print(\"%s-%s\" % (f['cell'], f['dataType']))\n >> f.fetch()\n\nInstallation\n------------\n\nThe easiest way to install most Python packages is via ``easy_install`` or ``pip``::\n\n $ pip install PyENCODE\n\nUsage\n-----\n\nThe main object, provided by the package is ``pyencode.Encode``. You create an instance, specifying the root of a cache directory::\n\n >> from pyencode import Encode\n >> e = Encode(cache_dir = 'wgEncode')\n\nThe default value for ``cache_dir`` is ``~/.pyencode``. The resulting object works as a dictionary, with keys being the different file collections within ENCODE::\n\n >> c['AwgSegmentation']\n \nAlternatively, you can use field names instead of dictionary keys, i.e. ``e['AwgSegmentation']`` is the same as ``e.AwgSegmentation``. To iterate over all collections, simply do::\n\n >> for c in e:\n >> print(c.name)\n\nEach element of the ``Encode`` object is a ``EncodeCollection`` object, which is acts as a collection of ``EncodeFile`` elements::\n\n >> for f in e.AwgSegmentation:\n >> print(f.name)\n\nSimiarly, dictionary-style or field name access can be used to retrieve files in a collection: ``e.AwgSegmentation['CombinedK562']`` or ``e.AwgSegmentation.CombinedK562``.\n\nEach ``EncodeFile`` is a dictionary of file metadata fields::\n\n >> print(e.AwgSegmentation.CombinedK562['cell'])\n\nIn addition, ``EncodeFile`` provides a set of convenience fields and methods:\n\n * ``fetch(force=False)`` - Download file into cache. Returns the ``EncodeFile`` object for convenient chaining of calls. When``force`` is ``False``, file will not be redownloaded if already in cache.\n * ``keys()`` - Set of all file attributes that can be accessed via ``[]``.\n * ``url`` - Return the URL of the file online.\n * ``local_url`` - The URL of the cached copy. It is not guaranteed that the file exists, so it is often more practical to do ``.fetch().local_url``.\n * ``local_path`` - Return the path of the locally cached copy. It is not guaranteed that the file exists. \n * ``open()`` - Open the file in binary mode for reading. If the file is not in cache, it is *not* downloaded to cache and opened from the web (so, it is often more practical to do ``.fetch().open()``).\n * ``open_text()`` - Open the file in text mode for reading. If the file is not in cache it is *not* downloaded to cache and opened from the web. If the file is a `.gz` file, it is automatically unpacked (i.e. the returned file instance is an opened `GzipFile`).\n * ``read_as_intervaltree()`` - Read a ``BED`` file into an ``intervaltree.bio.GenomeIntervalTree`` data structure. Simiarly, if the file is not in cache, it is not automatically downloaded.\n\nNote that ``Encode`` is not safe for doing multithreading or multiprocessing, unless all the necessary files are already cached.\n\n\nCopyright & License\n-------------------\n\nCopyright 2014, Konstantin Tretyakov\n\nMIT License.\n\nSubmit issues and fixes at https://github.com/konstantint/PyENCODE",
"description_content_type": null,
"docs_url": null,
"download_url": "UNKNOWN",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "http://kt.era.ee/",
"keywords": "bioinformatics data-access encode",
"license": "MIT",
"maintainer": null,
"maintainer_email": null,
"name": "PyENCODE",
"package_url": "https://pypi.org/project/PyENCODE/",
"platform": "UNKNOWN",
"project_url": "https://pypi.org/project/PyENCODE/",
"project_urls": {
"Download": "UNKNOWN",
"Homepage": "http://kt.era.ee/"
},
"release_url": "https://pypi.org/project/PyENCODE/0.2/",
"requires_dist": null,
"requires_python": null,
"summary": "Convenience package for accessing ENCODE (Encyclopedia of DNA Elements) project data at UCSC",
"version": "0.2"
},
"last_serial": 1398836,
"releases": {
"0.1": [
{
"comment_text": "",
"digests": {
"md5": "cfbb501ef41f43e292b61b732ed42f77",
"sha256": "ffd7d60cdd0424964bd192b3e1fc8ea787c11d73591ac0bf3bb4b0e3650a1d8a"
},
"downloads": -1,
"filename": "PyENCODE-0.1.zip",
"has_sig": false,
"md5_digest": "cfbb501ef41f43e292b61b732ed42f77",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 17795,
"upload_time": "2014-06-17T21:22:56",
"url": "https://files.pythonhosted.org/packages/37/da/ed3681d668a9a7e9d5a0ae41bdf66848fb7918d415adca7cf22166e9617f/PyENCODE-0.1.zip"
}
],
"0.2": [
{
"comment_text": "",
"digests": {
"md5": "98b771ef844d38f06ff1d35afe3b8a29",
"sha256": "4d7d95b4e3ec761dc42ef365188e33bc5bdc3e8f4c55c1a692ba25417a26b149"
},
"downloads": -1,
"filename": "PyENCODE-0.2.tar.gz",
"has_sig": false,
"md5_digest": "98b771ef844d38f06ff1d35afe3b8a29",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10349,
"upload_time": "2015-01-27T17:14:04",
"url": "https://files.pythonhosted.org/packages/78/53/71141179b769ef53190e80a8927533c8eaaa1283bb58379f62c32a2d6381/PyENCODE-0.2.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "98b771ef844d38f06ff1d35afe3b8a29",
"sha256": "4d7d95b4e3ec761dc42ef365188e33bc5bdc3e8f4c55c1a692ba25417a26b149"
},
"downloads": -1,
"filename": "PyENCODE-0.2.tar.gz",
"has_sig": false,
"md5_digest": "98b771ef844d38f06ff1d35afe3b8a29",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10349,
"upload_time": "2015-01-27T17:14:04",
"url": "https://files.pythonhosted.org/packages/78/53/71141179b769ef53190e80a8927533c8eaaa1283bb58379f62c32a2d6381/PyENCODE-0.2.tar.gz"
}
]
}