{ "info": { "author": "Tal Einat", "author_email": "taleinat@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: Implementation :: CPython", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "======================\nhtmldammit\n======================\n\n.. image:: https://img.shields.io/pypi/v/htmldammit.svg?style=flat\n :target: https://pypi.python.org/pypi/htmldammit\n :alt: Latest Version\n\n.. image:: https://img.shields.io/travis/taleinat/htmldammit/master.svg?style=flat\n :target: https://travis-ci.org/taleinat/htmldammit\n :alt: Build & Tests Status\n\n.. image:: https://img.shields.io/coveralls/taleinat/htmldammit/master.svg?style=flat\n :target: https://coveralls.io/r/taleinat/htmldammit\n :alt: Test Coverage\n\n.. image:: https://img.shields.io/pypi/l/htmldammit.svg?style=flat\n :target: https://github.com/taleinat/htmldammit/blob/master/LICENSE\n :alt: License: MIT\n\nMake every effort to properly decode HTML, because HTML is unicode, dammit!\n\nFeatures\n--------\n\n* Very easy to use with integrations for ``requests`` and ``urlopen()``.\n* Utilizes information from HTTP headers, inline encoding declarations,\n and UTF BOM-s (Byte Order Marks), as well as falling back to making a\n best guess based on the raw data.\n* Improves upon BeautifulSoup's great ``UnicodeDammit`` utility.\n\nInstallation\n------------\n\n.. code::\n\n pip install htmldammit\n\nAdditionally, it is *highly* recommended to install the ``cchardet`` and/or\nthe ``chardet`` libraries. 
This will enable the fallback to guessing the\nencoding based on the raw data.\n\n.. code::\n\n pip install cchardet chardet\n\nBasic usage\n-----------\n\nTo decode any binary HTML content into unicode (passing HTTP headers is optional):\n\n.. code:: python\n\n from htmldammit import decode_html\n html = decode_html(raw_html, http_headers)\n\nTo get unicode HTML from a ``requests`` response:\n\n.. code:: python\n\n import requests\n from htmldammit.integrations.requests import get_response_html\n response = requests.get('http://www.example.org/')\n html = get_response_html(response)\n\nTo get unicode HTML from a ``urlopen()`` response:\n\n.. code:: python\n\n from six.moves.urllib.request import urlopen  # works on Python 2 and 3\n from htmldammit.integrations.urllib import get_response_html\n response = urlopen('http://www.example.org/')\n html = get_response_html(response)\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/taleinat/htmldammit", "keywords": "htmldammit HTML unicode", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "htmldammit", "package_url": "https://pypi.org/project/htmldammit/", "platform": "", "project_url": "https://pypi.org/project/htmldammit/", "project_urls": { "Homepage": "https://github.com/taleinat/htmldammit" }, "release_url": "https://pypi.org/project/htmldammit/0.1.1/", "requires_dist": [ "six", "beautifulsoup4" ], "requires_python": "", "summary": "Make every effort to properly decode HTML, because HTML is unicode, dammit!", "version": "0.1.1" }, "last_serial": 3468534, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "86ed78894219f083d16496cb4bd76f4e", "sha256": "e0d0e7defdb3eb753c5056ca16128c5ec0775b131ddf2b40b6d7e86f426f63b6" }, "downloads": -1, "filename": "htmldammit-0.1.1-py2.py3-none-any.whl", "has_sig": true, "md5_digest": "86ed78894219f083d16496cb4bd76f4e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 
8210, "upload_time": "2018-01-07T11:34:54", "url": "https://files.pythonhosted.org/packages/9b/b4/92f2301a16e9db56f6022981d862e9e2719a44c17d8948735257ea881569/htmldammit-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "bc2f0ddd6378375e49b72dd3d0af7556", "sha256": "f90df0fa93fc2c7898be2512ce466a0400f23741e128f61ac7c63f94c4316c21" }, "downloads": -1, "filename": "htmldammit-0.1.1.tar.gz", "has_sig": true, "md5_digest": "bc2f0ddd6378375e49b72dd3d0af7556", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5902, "upload_time": "2018-01-07T11:34:55", "url": "https://files.pythonhosted.org/packages/3e/04/33da2e7a32eb3d0cf00db530d1ce737dfdfe8d28678960876d7ecc18a6c5/htmldammit-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "86ed78894219f083d16496cb4bd76f4e", "sha256": "e0d0e7defdb3eb753c5056ca16128c5ec0775b131ddf2b40b6d7e86f426f63b6" }, "downloads": -1, "filename": "htmldammit-0.1.1-py2.py3-none-any.whl", "has_sig": true, "md5_digest": "86ed78894219f083d16496cb4bd76f4e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8210, "upload_time": "2018-01-07T11:34:54", "url": "https://files.pythonhosted.org/packages/9b/b4/92f2301a16e9db56f6022981d862e9e2719a44c17d8948735257ea881569/htmldammit-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "bc2f0ddd6378375e49b72dd3d0af7556", "sha256": "f90df0fa93fc2c7898be2512ce466a0400f23741e128f61ac7c63f94c4316c21" }, "downloads": -1, "filename": "htmldammit-0.1.1.tar.gz", "has_sig": true, "md5_digest": "bc2f0ddd6378375e49b72dd3d0af7556", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5902, "upload_time": "2018-01-07T11:34:55", "url": "https://files.pythonhosted.org/packages/3e/04/33da2e7a32eb3d0cf00db530d1ce737dfdfe8d28678960876d7ecc18a6c5/htmldammit-0.1.1.tar.gz" } ] }