{ "info": { "author": "Nike Gurin-Petrovych", "author_email": "nike.gurin@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: POSIX" ], "description": "speedparser3\n------------\n\nSpeedparser3 is a Python 3.5+ version of the original Speedparser by https://github.com/jmoiron/speedparser.\n\nspeedparser\n-----------\n\nSpeedparser is a black-box \"style\" reimplementation of the `Universal Feed\nParser `_. It uses some feedparser code\nfor date and authors, but mostly re-implements its data normalization algorithms\nbased on feedparser output. It uses ``lxml`` for feed parsing and for optional\nHTML cleaning. Its compatibility with ``feedparser`` is very good for a strict\nsubset of fields, but poor for fields outside that subset. See\n``tests/speedparsertests.py`` for more information on which fields are more or\nless compatible and which are not.\n\nOn an Intel(R) Core(TM) i5 750, running only on one core, ``feedparser`` managed\n``2.5 feeds/sec`` on the test feed set (roughly 4200 \"feeds\" in \n``tests/feeds.tar.bz2``), while ``speedparser`` manages around ``65 feeds/sec``\nwith HTML cleaning on and ``200 feeds/sec`` with cleaning off.\n\ninstalling\n----------\n\n``pip3 install speedparser3``\n\nusage\n-----\n\nUsage is similar to feedparser::\n\n >>> import speedparser3\n >>> result = speedparser3.parse(feed)\n >>> result = speedparser3.parse(feed, clean_html=False)\n\ndifferences\n-----------\n\nThere are a few interface differences and many result differences between\nspeedparser3 and feedparser. The biggest similarity is that they both return\na ``FeedParserDict()`` object (with keys accessible as attributes), they both\nset the ``bozo`` key when an error is encountered, and various aspects of the\n``feed`` and ``entries`` keys are likely to be identical *or* very similar.\n\n``speedparser3`` uses different (and in some cases less or none; buyer beware)\ndata cleaning algorithms than ``feedparser``. When it is enabled, lxml's\n``html.cleaner`` library will be used to clean HTML and give similar but not\nidentical protection against various attributes and elements. If you supply\nyour own ``Cleaner`` element to the \"``clean_html`` kwarg, it will be used\nby ``speedparser3`` to clean the various attributes of the feed and entries.\n\n``speedparser3`` does not attempt to fix character encoding by default because\nthis processing can take a long time for large feeds. If the encoding value of\nthe feed is wrong, or if you want this extra level of error tollerance, you\ncan either use the ``chardet`` module to detect the encoding based on the\ndocument or pass ``encoding=True`` to ``speedparser3.parse`` and it will fall\nback to encoding detection if it encounters encoding errors.\n\nIf your application is using ``feedparser`` to consume many feeds at once and\nCPU is becoming a bottleneck, you might want to try out ``speedparser3`` as an\nalternative (using ``feedparser`` as a backup). If you are writing an\napplication that does not ingest many feeds, or where CPU is not a problem,\nyou should use ``feedparser`` as it is flexible with bad or malformed data and\nhas a much better test suite.", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/nikegp/speedparser3", "keywords": "feedparser speedparser rss atom rdf lxml python3", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "speedparser3", "package_url": "https://pypi.org/project/speedparser3/", "platform": "", "project_url": "https://pypi.org/project/speedparser3/", "project_urls": { "Homepage": "https://github.com/nikegp/speedparser3" }, "release_url": "https://pypi.org/project/speedparser3/0.3.1/", "requires_dist": null, "requires_python": ">=3.5", "summary": "Python3 version of speedparser https://github.com/jmoiron/speedparser", "version": "0.3.1" }, "last_serial": 2997948, "releases": { "0.3.0": [ { "comment_text": "", "digests": { "md5": "d120bc6ed8909c4e4d3f2c1cd1448c75", "sha256": "d2bc687ddab89d34b773cdb52baf76c9eb7a67192bb98c1a2736ffde5593f2db" }, "downloads": -1, "filename": "speedparser3-0.3.0.tar.gz", "has_sig": false, "md5_digest": "d120bc6ed8909c4e4d3f2c1cd1448c75", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 16736, "upload_time": "2017-07-03T22:29:00", "url": "https://files.pythonhosted.org/packages/af/3a/b87f8dcfe97db63ba328b105428fe49f5242bd926454a4c295f7879ac9d8/speedparser3-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "d1df234b48f9867769fc1acf15daff2f", "sha256": "3e1ad8140fc2d07e2dc94eb04c7be8e8e8d57b354210f410e82fbfc00ad101c9" }, "downloads": -1, "filename": "speedparser3-0.3.1.tar.gz", "has_sig": false, "md5_digest": "d1df234b48f9867769fc1acf15daff2f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 16752, "upload_time": "2017-07-03T22:37:08", "url": "https://files.pythonhosted.org/packages/f9/f0/550bc1b13665dd8ddbba496a837ca12cf8df0aadd286f9fc4874ee1e7e5c/speedparser3-0.3.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d1df234b48f9867769fc1acf15daff2f", "sha256": "3e1ad8140fc2d07e2dc94eb04c7be8e8e8d57b354210f410e82fbfc00ad101c9" }, "downloads": -1, "filename": "speedparser3-0.3.1.tar.gz", "has_sig": false, "md5_digest": "d1df234b48f9867769fc1acf15daff2f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 16752, "upload_time": "2017-07-03T22:37:08", "url": "https://files.pythonhosted.org/packages/f9/f0/550bc1b13665dd8ddbba496a837ca12cf8df0aadd286f9fc4874ee1e7e5c/speedparser3-0.3.1.tar.gz" } ] }