{ "info": { "author": "Anubhav Patel", "author_email": "anubhavp28@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "# Protego\n\n![build-badge](https://api.travis-ci.com/scrapy/protego.svg?branch=master)\n[![made-with-python](https://img.shields.io/badge/Made%20with-Python-1f425f.svg)](https://www.python.org/)\n## Overview\nProtego is a pure-Python `robots.txt` parser with support for modern conventions.\n\n## Requirements\n* Python 2.7 or Python 3.5+\n* Works on Linux, Windows, Mac OSX, BSD\n\n## Install\n\nTo install Protego, simply use pip:\n\n```\npip install protego\n```\n\n## Usage\n\n```python\n>>> from protego import Protego\n>>> robotstxt = \"\"\"\n... User-agent: *\n... Disallow: /\n... Allow: /about\n... Allow: /account\n... Disallow: /account/contact$\n... Disallow: /account/*/profile\n... Crawl-delay: 4\n... Request-rate: 10/1m # 10 requests every 1 minute\n... \n... Sitemap: http://example.com/sitemap-index.xml\n... Host: http://example.co.in\n... \"\"\"\n>>> rp = Protego.parse(robotstxt)\n>>> rp.can_fetch(\"http://example.com/profiles\", \"mybot\")\nFalse\n>>> rp.can_fetch(\"http://example.com/about\", \"mybot\")\nTrue\n>>> rp.can_fetch(\"http://example.com/account\", \"mybot\")\nTrue\n>>> rp.can_fetch(\"http://example.com/account/myuser/profile\", \"mybot\")\nFalse\n>>> rp.can_fetch(\"http://example.com/account/contact\", \"mybot\")\nFalse\n>>> rp.crawl_delay(\"mybot\")\n4.0\n>>> rp.request_rate(\"mybot\")\nRequestRate(requests=10, seconds=60, start_time=None, end_time=None)\n>>> list(rp.sitemaps)\n['http://example.com/sitemap-index.xml']\n>>> rp.preferred_host\n'http://example.co.in'\n```\n\nUsing Protego with [Requests](https://3.python-requests.org/)\n\n```python\n>>> from protego import Protego\n>>> import requests\n>>> r = requests.get(\"https://google.com/robots.txt\")\n>>> rp = Protego.parse(r.text)\n>>> rp.can_fetch(\"https://google.com/search\", \"mybot\")\nFalse\n>>> rp.can_fetch(\"https://google.com/search/about\", \"mybot\")\nTrue\n>>> list(rp.sitemaps)\n['https://www.google.com/sitemap.xml']\n```\n\n## Documentation\n\nClass `protego.Protego`:\n \n### Properties\n\n* `sitemaps` {`list_iterator`} A list of sitemaps specified in `robots.txt`.\n* `preferred_host` {string} Preferred host specified in `robots.txt`.\n\n### Methods\n\n* `parse(robotstxt_body)` Parse `robots.txt` and return a new instance of `protego.Protego`. \n* `can_fetch(url, user_agent)` Return True if the user agent can fetch the URL, otherwise return False.\n* `crawl_delay(user_agent)` Return the crawl delay specified for the user agent as a float. If nothing is specified, return None.\n* `request_rate(user_agent)` Return the request rate specified for the user agent as a named tuple `RequestRate(requests, seconds, start_time, end_time)`. If nothing is specified, return None.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "robots.txt,parser,robots,rep", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "Protego", "package_url": "https://pypi.org/project/Protego/", "platform": "", "project_url": "https://pypi.org/project/Protego/", "project_urls": null, "release_url": "https://pypi.org/project/Protego/0.1.15/", "requires_dist": null, "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", "summary": "Pure-Python robots.txt parser with support for modern conventions", "version": "0.1.15" }, "last_serial": 5744404, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "4f433acc4ee1d9e5c9b84db8f383d861", "sha256": "0ea25334ee0aeda9d7d4070aedde99b92d01bcb48d7fa39a8ef4bc508643c3d7" }, "downloads": -1, "filename": "Protego-0.1.tar.gz", "has_sig": false, "md5_digest": "4f433acc4ee1d9e5c9b84db8f383d861", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", "size": 3847679, "upload_time": "2019-08-19T07:32:38", "url": "https://files.pythonhosted.org/packages/18/63/92c44187989bf491b6c099f6d07e767e6a7762a940115050fa607d6e892b/Protego-0.1.tar.gz" } ], "0.1.12": [ { "comment_text": "", "digests": { "md5": "440e625453474c2d1152b659a28b8618", "sha256": "da21230e3ef2c64951a22b820e09d7054b17900c32ae03c1144ed8f89774d8bf" }, "downloads": -1, "filename": "Protego-0.1.12.tar.gz", "has_sig": false, "md5_digest": "440e625453474c2d1152b659a28b8618", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", "size": 3211083, "upload_time": "2019-08-28T17:35:52", "url": "https://files.pythonhosted.org/packages/02/5a/de3aebd2406a16cc1d7c12373ec814a35c7561e4717d406a1eef4762ae10/Protego-0.1.12.tar.gz" } ], "0.1.14": [ { "comment_text": "", "digests": { "md5": "5754701436fdc5914b362400a82c947e", "sha256": "f2ac0020cb74ce536760db4045aeff2e81b360a8fecbd993c2d542695006d5be" }, "downloads": -1, "filename": "Protego-0.1.14.tar.gz", "has_sig": false, "md5_digest": "5754701436fdc5914b362400a82c947e", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", "size": 3212969, "upload_time": "2019-08-28T17:56:09", "url": "https://files.pythonhosted.org/packages/e9/78/a6b2ee370a1bf989595b0c8b2d01a315333740d36c6dc4b95661f7f4010d/Protego-0.1.14.tar.gz" } ], "0.1.15": [ { "comment_text": "", "digests": { "md5": "69532574f2a169c6fa9ec7d5db375447", "sha256": "457238bc55ce864547cc8fe45f9dbfd0161b9d27d94f2eb313cc28f3b9145186" }, "downloads": -1, "filename": "Protego-0.1.15.tar.gz", "has_sig": false, "md5_digest": "69532574f2a169c6fa9ec7d5db375447", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", "size": 3211239, "upload_time": "2019-08-28T18:01:41", "url": "https://files.pythonhosted.org/packages/e8/4b/c72e7d801facc2f519824680b65d76373e6bb289df668dbf8758ea21ff10/Protego-0.1.15.tar.gz" } ], "0.1.dev0": [ { "comment_text": "", "digests": { "md5": "42bc4c9c481da417b320c7578a791e12", "sha256": "860f5734f43f2bdb96b90457772fa9d198c9d7a31c3ddf24195d1798efbae2fb" }, "downloads": -1, "filename": "Protego-0.1.dev0-py2.7.egg", "has_sig": false, "md5_digest": "42bc4c9c481da417b320c7578a791e12", "packagetype": "bdist_egg", "python_version": "2.7", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*", "size": 9284, "upload_time": "2019-07-18T12:39:42", "url": "https://files.pythonhosted.org/packages/95/95/92cdfab438613721c84e550fbfccd9e3a7393de307ca2ee3a62774f5aef0/Protego-0.1.dev0-py2.7.egg" }, { "comment_text": "", "digests": { "md5": "438b39f45b2c719a28e308b2c971b8b9", "sha256": "a688ba1f8fae4a8968730f32b3ca3a274ca464a058267a7c4a987b20d2b6b86f" }, "downloads": -1, "filename": "Protego-0.1.dev0-py3.7.egg", "has_sig": false, "md5_digest": "438b39f45b2c719a28e308b2c971b8b9", "packagetype": "bdist_egg", "python_version": "3.7", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*", "size": 9281, "upload_time": "2019-07-18T12:39:46", "url": "https://files.pythonhosted.org/packages/17/f7/b70f56760c04b3c7f76d4f61e9879ef18e69728a4a1b86954240dde87155/Protego-0.1.dev0-py3.7.egg" }, { "comment_text": "", "digests": { "md5": "bf3c4c823aba0244dd8abcf637ab9909", "sha256": "adda509c6926c1c18b06445cc1b6b71e95e43ef17afc4ebd0d02ad313f954947" }, "downloads": -1, "filename": "Protego-0.1.dev0.tar.gz", "has_sig": false, "md5_digest": "bf3c4c823aba0244dd8abcf637ab9909", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*", "size": 4924, "upload_time": "2019-07-18T12:39:47", "url": "https://files.pythonhosted.org/packages/a0/59/01c3a825711cfa01583144fe1f7f05bd59661b700a9638daa846466a59f1/Protego-0.1.dev0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "69532574f2a169c6fa9ec7d5db375447", "sha256": "457238bc55ce864547cc8fe45f9dbfd0161b9d27d94f2eb313cc28f3b9145186" }, "downloads": -1, "filename": "Protego-0.1.15.tar.gz", "has_sig": false, "md5_digest": "69532574f2a169c6fa9ec7d5db375447", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", "size": 3211239, "upload_time": "2019-08-28T18:01:41", "url": "https://files.pythonhosted.org/packages/e8/4b/c72e7d801facc2f519824680b65d76373e6bb289df668dbf8758ea21ff10/Protego-0.1.15.tar.gz" } ] }