{ "info": { "author": "Han Lin", "author_email": "hotdogee@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4" ], "description": "===============================\ngff3-py\n===============================\n\n.. image:: https://travis-ci.org/hsiaoyi0504/gff3-py.svg?branch=master\n :target: https://travis-ci.org/hsiaoyi0504/gff3-py\n\n\nManipulate genomic features and validate the syntax and reference sequence of your |GFF3|_ files.\n\n* Free software: NAL Public Domain License\n* Documentation: https://gff3-py.readthedocs.org.\n\nFeatures\n--------\n\n* **Simple data structures**: Parses a |GFF3|_ file into a structure composed of simple python |dict|_ and |list|_.\n* **Validation**: Validates the |GFF3|_ syntax on parse, and saves the error messages in the parsed structure.\n* **Best effort parsing**: Despite any detected errors, continue to parse the whole file and make as much sense to it as possible.\n* Uses the python |logging|_ library to log error messages with support for custom loggers.\n* Parses embeded or external |FASTA|_ sequences to check bounds and number of ``N`` s.\n* Check and correct the phase for ``CDS`` features.\n* Tree traversal methods ``ancestors`` and ``descendants`` returns a simple ``list`` in Breadth-first search order.\n* Transfer children and parents using the ``adopt`` and ``adopted`` methods.\n* Test for overlapping features using the ``overlap`` method.\n* Remove a feature and its associated features using the ``remove`` method.\n* Write the modified structure to a GFF3 file using the ``write`` mthod.\n\nQuick Start\n-----------\n\nAn example that just parses a GFF3 file named ``annotations.gff`` and validates it \nusing an external FASTA file named ``annotations.fa`` looks like:\n\n.. code:: python\n\n # validate.py\n # ============\n from gff3 import Gff3\n\n # initialize a Gff3 object\n gff = Gff3()\n # parse GFF3 file and do syntax checking, this populates gff.lines and gff.features\n # if an embedded ##FASTA directive is found, parse the sequences into gff.fasta_embedded\n gff.parse('annotations.gff')\n # parse the external FASTA file into gff.fasta_external\n gff.parse_fasta_external('annotations.fa')\n # Check seqid, bounds and the number of Ns in each feature using one or more reference sources\n gff.check_reference(allowed_num_of_n=0, feature_types=['CDS'])\n # Checks whether child features are within the coordinate boundaries of parent features\n gff.check_parent_boundary()\n # Calculates the correct phase and checks if it matches the given phase for CDS features\n gff.check_phase()\n \nA more feature complete GFF3 validator with a command line interface which also generates validation\nreport in MarkDown is available under ``examples/gff_valid.py``\n\nThe following example demonstrates how to filter, tranverse, and modify the parsed gff3 ``lines`` list.\n\n1. Change features with type ``exon`` to ``pseudogenic_exon`` and type ``transcript`` to ``pseudogenic_transcript`` if the feature has an ancestor of type ``pseudogene``\n\n2. If a ``pseudogene`` feature overlaps with a ``gene`` feature, move all of the children from the ``pseudogene`` feature to the ``gene`` feature, and remove the ``pseudogene`` feature.\n\n.. code:: python\n\n # fix_pseudogene.py\n # =================\n from gff3 import Gff3\n gff = Gff3('annotations.gff')\n type_map = {'exon': 'pseudogenic_exon', 'transcript': 'pseudogenic_transcript'}\n pseudogenes = [line for line in gff.lines if line['line_type'] == 'feature' and line['type'] == 'pseudogene']\n for pseudogene in pseudogenes:\n # convert types\n for line in gff.descendants(pseudogene):\n if line['type'] in type_map:\n line['type'] = type_map[line['type']]\n # find overlapping gene\n overlapping_genes = [line for line in gff.lines if line['line_type'] == 'feature' and line['type'] == 'gene' and gff.overlap(line, pseudogene)]\n if overlapping_genes:\n # move pseudogene children to overlapping gene\n gff.adopt(pseudogene, overlapping_genes[0])\n # remove pseudogene\n gff.remove(pseudogene)\n gff.write('annotations_fixed.gff')\n\n.. |GFF3| replace:: ``GFF3``\n.. |dict| replace:: ``dict``\n.. |list| replace:: ``list``\n.. |logging| replace:: ``logging``\n.. |FASTA| replace:: ``FASTA``\n\n.. _GFF3: http://www.sequenceontology.org/gff3.shtml\n.. _dict: https://docs.python.org/2/tutorial/datastructures.html#dictionaries\n.. _list: https://docs.python.org/2/tutorial/datastructures.html#more-on-lists\n.. _logging: https://docs.python.org/2/library/logging.html\n.. _FASTA: http://en.wikipedia.org/wiki/FASTA_format\n\n\n\n\nHistory\n-------\n\n\n\n1.0.0 (2018-12-01)\n---------------------\n\n* Fix python3 problem\n* Added sequence functions: complement(seq) and translate(seq)\n* Added fasta write function: fasta_dict_to_file(fasta_dict, fasta_file, line_char_limit=None)\n* Added Gff method to return the sequence of line_data: sequence(self, line_data, child_type=None, reference=None)\n* Gff.write no longer prints redundent '###' when the whole gene is marked as removed\n\n\n0.3.0 (2015-03-10)\n---------------------\n\n* Fixed phase checking.\n\n0.2.0 (2015-01-28)\n---------------------\n\n* Supports python 2.6, 2.7, 3.3, 3.4, pypy.\n* Don't report empty attributes as errors.\n* Improved documentation.\n\n0.1.0 (2014-12-11)\n---------------------\n\n* First release on PyPI.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hotdogee/gff3-py", "keywords": "gff3", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "gff3", "package_url": "https://pypi.org/project/gff3/", "platform": "", "project_url": "https://pypi.org/project/gff3/", "project_urls": { "Homepage": "https://github.com/hotdogee/gff3-py" }, "release_url": "https://pypi.org/project/gff3/1.0.0/", "requires_dist": null, "requires_python": "", "summary": "Manipulate genomic features and validate the syntax and reference sequence of your GFF3 files.", "version": "1.0.0" }, "last_serial": 4549723, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "059b0d396fbe2bfa6863cc3d4c7ccbe7", "sha256": "664f5f8fcd629d9fdc465fc2f62dd6179fdd0f2541dd53ea643b4e30e84315ed" }, "downloads": -1, "filename": "gff3-0.1.0.zip", "has_sig": false, "md5_digest": "059b0d396fbe2bfa6863cc3d4c7ccbe7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27215, "upload_time": "2014-12-11T17:49:32", "url": "https://files.pythonhosted.org/packages/a9/e0/cfbde177f710ab9010c3f49e25deecc4a4b2274da4f271c9ac43fd4eb182/gff3-0.1.0.zip" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "ad26ee116ad2ffd4308b98ace67bdc1a", "sha256": "2eed7da1c40793926cc067a47aed6ebd423c59ceda63845a18441a1db1200b6d" }, "downloads": -1, "filename": "gff3-0.2.0.zip", "has_sig": false, "md5_digest": "ad26ee116ad2ffd4308b98ace67bdc1a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 32121, "upload_time": "2015-01-28T17:42:39", "url": "https://files.pythonhosted.org/packages/03/ba/5a870777f9130a87302edc35b56f19125a00de599501292858f242d583f0/gff3-0.2.0.zip" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "c648bdb5ddb04fa38f28a07ef52abe64", "sha256": "14bd20f0fd94386a2848f1af768c2af1bb6dfc9bdc4e433dff0bbc7c15abc245" }, "downloads": -1, "filename": "gff3-0.3.0.zip", "has_sig": false, "md5_digest": "c648bdb5ddb04fa38f28a07ef52abe64", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 32472, "upload_time": "2015-03-10T18:32:11", "url": "https://files.pythonhosted.org/packages/e1/9d/22b29188092d5c51e80181b6e0e880cf04fafb4387bd84a7d8c16b7cfe66/gff3-0.3.0.zip" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": "efb69c56451ea864082e445c25756507", "sha256": "7a1d22e5d35e57af8c1ff901e933c46ba7781e2ff83d3dd02871b9bcaec503c6" }, "downloads": -1, "filename": "gff3-1.0.0.macosx-10.14-x86_64.tar.gz", "has_sig": false, "md5_digest": "efb69c56451ea864082e445c25756507", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37882, "upload_time": "2018-12-01T07:12:36", "url": "https://files.pythonhosted.org/packages/85/9d/7a2c175b06a9c97360f1dc93e168f229e19c13c930e494761a0696e2f7c5/gff3-1.0.0.macosx-10.14-x86_64.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "efb69c56451ea864082e445c25756507", "sha256": "7a1d22e5d35e57af8c1ff901e933c46ba7781e2ff83d3dd02871b9bcaec503c6" }, "downloads": -1, "filename": "gff3-1.0.0.macosx-10.14-x86_64.tar.gz", "has_sig": false, "md5_digest": "efb69c56451ea864082e445c25756507", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37882, "upload_time": "2018-12-01T07:12:36", "url": "https://files.pythonhosted.org/packages/85/9d/7a2c175b06a9c97360f1dc93e168f229e19c13c930e494761a0696e2f7c5/gff3-1.0.0.macosx-10.14-x86_64.tar.gz" } ] }