{ "info": { "author": "Henrik Blidh", "author_email": "henrik.blidh@nedomkull.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Operating System :: OS Independent", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Topic :: Text Processing :: Markup :: XML" ], "description": "xmlr\n====\n\n|Build Status| |Coverage Status|\n\nIt can be problematic to handle large XML files (>> 10 MB) and using the\n``xml`` module in Python directly leads to huge memory overheads. Most\noften, these large XML files are pure data files, storing highly\nstructured data that have no intrinsic need to be stored in XML.\n\nThis package provides iterative methods for dealing with them, reading\nthe XML documents into Python dict representation instead, according to the\nmethodology specified on the page `Converting Between XML and JSON\n`_. ``xmlr`` is inspired by the solutions\ndescribed in the blog posts `High-performance XML parsing in Python with lxml\n`_ and\n`Parsing large XML files, serially, in Python\n`_,\nenabling the parsing of very large documents without problems with\novertaxing the memory.\n\n.. pull-quote::\n\n This package generally provides a one way trip; there is not necessarily\n a bijectional relation with the XML source after parsing.\n\nInstallation\n------------\n\n::\n\n pip install xmlr\n\nUsage\n-----\n\nTo parse an entire document, use the ``xmlparse`` method:\n\n.. code:: python\n\n from xmlr import xmlparse\n\n doc = xmlparse('very_large_doc.xml')\n\nAn iterator, ``xmliter``, yielding elements of a specified type as they\nare parsed from the document is also present:\n\n.. 
code:: python\n\n from xmlr import xmliter\n\n for d in xmliter('very_large_record.xml', 'Record'):\n print(d)\n\nThe desired parser can also be specified. Available methods are:\n\n- ``ELEMENTTREE`` - Using ``xml.etree.ElementTree`` as backend.\n- ``C_ELEMENTTREE`` - Using ``xml.etree.cElementTree`` as backend.\n- ``LXML_ELEMENTTREE`` - Using ``lxml.etree`` as backend. Requires\n installation of the ``lxml`` package.\n\nThese can then be used like this:\n\n.. code:: python\n\n from xmlr import xmliter, XMLParsingMethods\n\n for d in xmliter('very_large_record.xml', 'Record', parser=XMLParsingMethods.LXML_ELEMENTTREE):\n print(d)\n\nNo type conversion is performed right now. A value in the output\ndictionary can have the type ``dict`` (a subdocument), ``list`` (an\narray of similar documents), ``str`` (a leaf or value) or ``None``\n(empty XML leaf tag). All keys are of the type ``str``.\n\nTests\n~~~~~\n\nTests are run with ``pytest``:\n\n.. code:: bash\n\n $ py.test tests/\n ============================= test session starts ==============================\n platform linux2 -- Python 2.7.6, pytest-2.9.1, py-1.4.31, pluggy-0.3.1\n rootdir: /home/hbldh/Repos/xmlr, inifile:\n collected 50 items\n\n tests/test_iter.py ...........................\n tests/test_methods.py ..\n tests/test_parsing.py .....................\n\n ========================== 50 passed in 0.50 seconds ===========================\n\nThe tests fetch some XML documents from `W3Schools XML tutorials`_ and\nalso use a bundled, slimmed down version of the document available at\n`U.S. copyright renewal records available for download\n`_.\n\n\n.. _W3Schools XML tutorials: http://www.w3schools.com/xml/xml_examples.asp\n\n.. |Build Status| image:: https://travis-ci.org/hbldh/xmlr.svg?branch=master\n :target: https://travis-ci.org/hbldh/xmlr\n.. 
|Coverage Status| image:: https://coveralls.io/repos/github/hbldh/xmlr/badge.svg?branch=master\n :target: https://coveralls.io/github/hbldh/xmlr?branch=master\n\n\n\n\nv0.3.1 (2016-08-16)\n===================\n- Made available on PyPI\n\nv0.3.0 (2016-05-23)\n===================\n- Renaming from `xmller` to `xmlr`.\n- General improvements.\n- Test coverage increased.\n- More documentation.\n- Development Status classifier increased from Alpha to Beta.\n\nv0.2.0 (2016-05-20)\n===================\n- Bugfixes.\n- `xmliter` method written.\n- `lxml` support added.\n- Improved parser selection.\n- Increased test coverage to >90%.\n\nv0.1.0 (2016-05-17)\n===================\n- Initial release", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hbldh/xmlr", "keywords": "XML,parsing,json,conversion", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "xmlr", "package_url": "https://pypi.org/project/xmlr/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/xmlr/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/hbldh/xmlr" }, "release_url": "https://pypi.org/project/xmlr/0.3.1/", "requires_dist": null, "requires_python": null, "summary": "XML parsing package for very large files", "version": "0.3.1" }, "last_serial": 2283777, "releases": { "0.3.1": [ { "comment_text": "", "digests": { "md5": "c9e902b631655b7438235e805929b0d5", "sha256": "2ffa67dfb27eee514ff30eeb8e4b67830456ee5d80822e0c3a940f39d796f37b" }, "downloads": -1, "filename": "xmlr-0.3.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "c9e902b631655b7438235e805929b0d5", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 11469, "upload_time": "2016-08-16T08:53:00", "url": 
"https://files.pythonhosted.org/packages/ac/ac/afc64d4cf5fd64bed18390e1b59058373f86f548b7caba2f80df68f5bf2f/xmlr-0.3.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cdcfd6aa5eaab531ca434b7ac6ff2f8d", "sha256": "5fed62f9a7f963796d1c01fdb36ebe0dee0c44f18ea281dc2674bb3530715516" }, "downloads": -1, "filename": "xmlr-0.3.1.tar.gz", "has_sig": false, "md5_digest": "cdcfd6aa5eaab531ca434b7ac6ff2f8d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8068, "upload_time": "2016-08-16T08:52:47", "url": "https://files.pythonhosted.org/packages/15/8c/0048af8beff056b21fad4002a56233a9827ddc157dec38ba06f4413797b9/xmlr-0.3.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c9e902b631655b7438235e805929b0d5", "sha256": "2ffa67dfb27eee514ff30eeb8e4b67830456ee5d80822e0c3a940f39d796f37b" }, "downloads": -1, "filename": "xmlr-0.3.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "c9e902b631655b7438235e805929b0d5", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 11469, "upload_time": "2016-08-16T08:53:00", "url": "https://files.pythonhosted.org/packages/ac/ac/afc64d4cf5fd64bed18390e1b59058373f86f548b7caba2f80df68f5bf2f/xmlr-0.3.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cdcfd6aa5eaab531ca434b7ac6ff2f8d", "sha256": "5fed62f9a7f963796d1c01fdb36ebe0dee0c44f18ea281dc2674bb3530715516" }, "downloads": -1, "filename": "xmlr-0.3.1.tar.gz", "has_sig": false, "md5_digest": "cdcfd6aa5eaab531ca434b7ac6ff2f8d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8068, "upload_time": "2016-08-16T08:52:47", "url": "https://files.pythonhosted.org/packages/15/8c/0048af8beff056b21fad4002a56233a9827ddc157dec38ba06f4413797b9/xmlr-0.3.1.tar.gz" } ] }