{ "info": { "author": "Muthu Rajendran R G", "author_email": "muthurajendranrg@gmail.com", "bugtrack_url": null, "classifiers": [ "Environment :: Web Environment", "Intended Audience :: Developers", "Operating System :: OS Independent", "Programming Language :: Python" ], "description": "This code is under the Apache License 2.0. http://www.apache.org/licenses/LICENSE-2.0\n\nThis is a modified version, tuned for a higher recall rate, of the well-known Python port (https://github.com/buriy/python-readability) of a Ruby port of arc90's readability project\n\nhttp://lab.arc90.com/experiments/readability/\nhttps://github.com/buriy/python-readability\n\nIn a few words:\nGiven an HTML document, it pulls out the main body text and cleans it up.\nIt can also clean up the title, based on the latest readability.js code.\n\nBased on:\n - Latest readability.js ( https://github.com/MHordecki/readability-redux/blob/master/readability/readability.js )\n - Ruby port by starrhorne and iterationlabs\n - Python port by gfxmonk ( https://github.com/gfxmonk/python-readability , based on BeautifulSoup )\n - Python port by buriy (https://github.com/buriy/python-readability)\n - Decruft effort to move to lxml ( http://www.minvolai.com/blog/decruft-arc90s-readability-in-python/ )\n - \"BR to P\" fix from readability.js which improves quality for smaller texts.\n - GitHub users' contributions.\n\nInstallation::\n\n pip install readability-dig\n\nUsage::\n\n from readability.readability import Document\n import urllib\n html = urllib.urlopen(url).read()\n readable_article = Document(html).summary()\n readable_title = Document(html).short_title()\n\nCommand-line usage::\n\n python -m readability.readability -u http://pypi.python.org/pypi/readability-dig\n\nTo open the resulting page in a browser::\n\n python -m readability.readability -b -u http://pypi.python.org/pypi/readability-dig\n\nUsing positive/negative keywords example::\n\n python -m readability.readability -p intro -n newsindex,homepage-box,news-section -u 
http://python.org\n\n\nDocument() kwarg options:\n\n - attributes:\n - debug: output debug messages\n - min_text_length:\n - retry_length:\n - url: will allow adjusting links to be absolute\n - positive_keywords: the list of positive search patterns in classes and ids, for example: [\"news-item\", \"block\"]\n - negative_keywords: the list of negative search patterns in classes and ids, for example: [\"mysidebar\", \"related\", \"ads\"]", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/muthurajendran/python-readability-dig/tarball/0.1", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/muthurajendran/python-readability-dig", "keywords": null, "license": "Apache License 2.0", "maintainer": null, "maintainer_email": null, "name": "readability-dig", "package_url": "https://pypi.org/project/readability-dig/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/readability-dig/", "project_urls": { "Download": "https://github.com/muthurajendran/python-readability-dig/tarball/0.1", "Homepage": "https://github.com/muthurajendran/python-readability-dig" }, "release_url": "https://pypi.org/project/readability-dig/0.5/", "requires_dist": null, "requires_python": null, "summary": "Modified arc90's scraping hub port readability tool for dig data", "version": "0.5" }, "last_serial": 2064445, "releases": { "0.1": [], "0.4": [ { "comment_text": "", "digests": { "md5": "987e00cfd25706b82d76061c1642ae80", "sha256": "26ad3deb7e65faf5b46cf7b8015288420b08c53563dd7af4074ddc5ed1ccf1b7" }, "downloads": -1, "filename": "readability-dig-0.4.tar.gz", "has_sig": false, "md5_digest": "987e00cfd25706b82d76061c1642ae80", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12516, "upload_time": "2016-04-14T20:04:55", "url": "https://files.pythonhosted.org/packages/12/bd/a16262ebfd6bf58ae2774a7ee57c4e41de8930b9b394223bbec722fb6464/readability-dig-0.4.tar.gz" } ], 
"0.5": [ { "comment_text": "", "digests": { "md5": "f4525f050ea8a2c6e0bb938baac29157", "sha256": "b8cede718d7b513b19323e08bb989164a60731b5668a837db104a778aa2139b2" }, "downloads": -1, "filename": "readability-dig-0.5.tar.gz", "has_sig": false, "md5_digest": "f4525f050ea8a2c6e0bb938baac29157", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12506, "upload_time": "2016-04-14T20:12:26", "url": "https://files.pythonhosted.org/packages/89/46/81f1b28340026570ea4df48ff80a4d7e75dee1ecd6df0a83b92af40dc17a/readability-dig-0.5.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f4525f050ea8a2c6e0bb938baac29157", "sha256": "b8cede718d7b513b19323e08bb989164a60731b5668a837db104a778aa2139b2" }, "downloads": -1, "filename": "readability-dig-0.5.tar.gz", "has_sig": false, "md5_digest": "f4525f050ea8a2c6e0bb938baac29157", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12506, "upload_time": "2016-04-14T20:12:26", "url": "https://files.pythonhosted.org/packages/89/46/81f1b28340026570ea4df48ff80a4d7e75dee1ecd6df0a83b92af40dc17a/readability-dig-0.5.tar.gz" } ] }