{ "info": { "author": "William Forde", "author_email": "willforde@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Text Processing :: Markup :: HTML" ], "description": ".. image:: https://badge.fury.io/py/htmlement.svg\n :target: https://pypi.python.org/pypi/htmlement\n\n.. image:: https://readthedocs.org/projects/python-htmlement/badge/?version=stable\n :target: http://python-htmlement.readthedocs.io/en/stable/?badge=stable\n\n.. image:: https://travis-ci.org/willforde/python-htmlement.svg?branch=master\n :target: https://travis-ci.org/willforde/python-htmlement\n\n.. image:: https://coveralls.io/repos/github/willforde/python-htmlement/badge.svg?branch=master\n :target: https://coveralls.io/github/willforde/python-htmlement?branch=master\n\n.. image:: https://api.codacy.com/project/badge/Grade/6b46406e1aa24b95947b3da6c09a4ab5\n :target: https://www.codacy.com/app/willforde/python-htmlement?utm_source=github.com&utm_medium=referral&utm_content=willforde/python-htmlement&utm_campaign=Badge_Grade\n\n.. image:: https://img.shields.io/pypi/pyversions/htmlement.svg\n :target: https://pypi.python.org/pypi/htmlement\n\n.. 
image:: https://img.shields.io/badge/Say%20Thanks-!-1EAEDB.svg\n :target: https://saythanks.io/to/willforde\n\nHTMLement\n---------\n\nHTMLement is a pure Python HTML parser.\n\nThe goal of this project is to be a pure-Python HTML parser that is also faster than BeautifulSoup.\nLike BeautifulSoup, it will also parse invalid HTML.\n\nThe simplest way to do this is to use ElementTree `XPath expressions`__.\nPython supports a simple (read: limited) XPath engine in its ElementTree module.\nA benefit of using ElementTree is that it can use a C implementation whenever one is available.\n\nThis HTML parser extends `html.parser.HTMLParser`_ to build a tree of `ElementTree.Element`_ instances.\n\nInstall\n-------\nRun ::\n\n pip install htmlement\n\n-or- ::\n\n pip install git+https://github.com/willforde/python-htmlement.git\n\nParsing HTML\n------------\nHere I\u2019ll use a sample HTML document that will be parsed using htmlement: ::\n\n html = \"\"\"\n \n
\n