{ "info": { "author": "Vitalii", "author_email": "moskrc@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Environment :: Web Environment", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3 :: Only", "Topic :: Internet", "Topic :: Internet :: WWW/HTTP :: Indexing/Search", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "## About CrawlerDetect\n\n**CrawlerDetect** is a Python port of the PHP class [CrawlerDetect](https://github.com/JayBizzle/Crawler-Detect).\n\nIt detects bots/crawlers/spiders via the user agent and other HTTP headers. It can currently detect thousands of bots/spiders/crawlers.\n\n### Installation\nRun `pip install crawlerdetect`\n\n### Usage\n\n#### Variant 1\n```Python\nfrom crawlerdetect import CrawlerDetect\ncrawler_detect = CrawlerDetect()\ncrawler_detect.isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')\n# True if a crawler user agent is detected\n```\n\n#### Variant 2\n```Python\nfrom crawlerdetect import CrawlerDetect\ncrawler_detect = CrawlerDetect(user_agent='Mozilla/5.0 (iPhone; CPU iPhone OS 7_1 like Mac OS X) AppleWebKit (KHTML, like Gecko) Mobile (compatible; Yahoo Ad monitoring; https://help.yahoo.com/kb/yahoo-ad-monitoring-SLN24857.html)')\ncrawler_detect.isCrawler()\n# True if a crawler user agent is detected\n```\n\n#### Variant 3\n```Python\nfrom crawlerdetect import CrawlerDetect\ncrawler_detect = CrawlerDetect(headers={'DOCUMENT_ROOT': '/home/test/public_html', 'GATEWAY_INTERFACE': 'CGI/1.1', 'HTTP_ACCEPT': '*/*', 'HTTP_ACCEPT_ENCODING': 'gzip, deflate', 
'HTTP_CACHE_CONTROL': 'no-cache', 'HTTP_CONNECTION': 'Keep-Alive', 'HTTP_FROM': 'googlebot(at)googlebot.com', 'HTTP_HOST': 'www.test.com', 'HTTP_PRAGMA': 'no-cache', 'HTTP_USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36', 'PATH': '/bin:/usr/bin', 'QUERY_STRING': 'order=closingDate', 'REDIRECT_STATUS': '200', 'REMOTE_ADDR': '127.0.0.1', 'REMOTE_PORT': '3360', 'REQUEST_METHOD': 'GET', 'REQUEST_URI': '/?test=testing', 'SCRIPT_FILENAME': '/home/test/public_html/index.php', 'SCRIPT_NAME': '/index.php', 'SERVER_ADDR': '127.0.0.1', 'SERVER_ADMIN': 'webmaster@test.com', 'SERVER_NAME': 'www.test.com', 'SERVER_PORT': '80', 'SERVER_PROTOCOL': 'HTTP/1.1', 'SERVER_SIGNATURE': '', 'SERVER_SOFTWARE': 'Apache', 'UNIQUE_ID': 'Vx6MENRxerBUSDEQgFLAAAAAS', 'PHP_SELF': '/index.php', 'REQUEST_TIME_FLOAT': 1461619728.0705, 'REQUEST_TIME': 1461619728})\ncrawler_detect.isCrawler()\n# True if a crawler user agent is detected\n```\n#### Output the name of the bot that matched (if any)\n```Python\nfrom crawlerdetect import CrawlerDetect\ncrawler_detect = CrawlerDetect()\ncrawler_detect.isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')\n# True if a crawler user agent is detected\ncrawler_detect.getMatches()\n# Sosospider\n```\n\n### Contributing\nIf you find a bot/spider/crawler user agent that CrawlerDetect fails to detect, please submit a pull request with the regex pattern added to the array in `providers/crawlers.py` and add the failing user agent to `tests/crawlers.txt`.\n\nFailing that, just create an issue with the user agent you have found, and we'll take it from there :)\n\n### ES6 Library\nTo use this library with Node.js or any ES6-based application, check out [es6-crawler-detect](https://github.com/JefferyHus/es6-crawler-detect).\n\n### .NET Library\nTo use this library in a .NET Standard (including .NET Core) based project, check out 
[NetCrawlerDetect](https://github.com/gplumb/NetCrawlerDetect).\n\n### Nette Extension\nTo use this library with the Nette framework, check out [NetteCrawlerDetect](https://github.com/JanGalek/Crawler-Detect).\n\n### Ruby Gem\n\nTo use this library with Ruby on Rails or any Ruby-based application, check out the [crawler_detect](https://github.com/loadkpi/crawler_detect) gem.\n\n_Parts of this class are based on the brilliant [MobileDetect](https://github.com/serbanghita/Mobile-Detect)_\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/moskrc/CrawlerDetect/tarball/0.1.4", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/moskrc/CrawlerDetect", "keywords": "crawler,crawler detect,crawler detector,crawlerdetect,python crawler detect", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "crawlerdetect", "package_url": "https://pypi.org/project/crawlerdetect/", "platform": "any", "project_url": "https://pypi.org/project/crawlerdetect/", "project_urls": { "Documentation": "https://crawlerdetect.readthedocs.io", "Download": "https://github.com/moskrc/CrawlerDetect/tarball/0.1.4", "Homepage": "https://github.com/moskrc/CrawlerDetect" }, "release_url": "https://pypi.org/project/crawlerdetect/0.1.4/", "requires_dist": null, "requires_python": ">=3.4, <4", "summary": "CrawlerDetect is a Python class for detecting bots/crawlers/spiders via the user agent.", "version": "0.1.4" }, "last_serial": 5686345, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c0d35309076cbd35f3fe10b27f2f8b07", "sha256": "58e4903220f9980000ffca3990ab21be326b36577e12e41e46d5f1121344ef0c" }, "downloads": -1, "filename": "CrawlerDetect-0.0.1.tar.gz", "has_sig": false, "md5_digest": "c0d35309076cbd35f3fe10b27f2f8b07", "packagetype": 
"sdist", "python_version": "source", "requires_python": null, "size": 15433, "upload_time": "2019-08-15T13:58:18", "url": "https://files.pythonhosted.org/packages/46/7b/b6a12a89cb7a2e04147e8478ec2917fc18dedfa928954c86f896f4016e96/CrawlerDetect-0.0.1.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "1558094759eff6b19e512606dc44708f", "sha256": "0028720f63e98c46ecb86bdef515089f9366558f01dd31580291daaeb1434209" }, "downloads": -1, "filename": "CrawlerDetect-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1558094759eff6b19e512606dc44708f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15588, "upload_time": "2019-08-16T07:02:43", "url": "https://files.pythonhosted.org/packages/55/ae/3c922fdb2887e4a966da5b4540864460e5214a12d626eed030d52d950f2d/CrawlerDetect-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "96ab7450cc5831ecae74b740e13f9b95", "sha256": "b961980570c044a9929405652a9658dee53ca10afca17cc2a4feb4c57ec4bc1e" }, "downloads": -1, "filename": "CrawlerDetect-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "96ab7450cc5831ecae74b740e13f9b95", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 16974, "upload_time": "2019-08-16T07:17:35", "url": "https://files.pythonhosted.org/packages/13/9c/2b2c33b046fd649e48ea7c247e363bbce71c43469e93c968c1ef1aa757bc/CrawlerDetect-0.1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e3fb21bbf864f864ec8f387203114090", "sha256": "4ba2a7528983c39049dac99d9f419ce429e5aaf20092f41f076ca7290ffbe553" }, "downloads": -1, "filename": "CrawlerDetect-0.1.2.tar.gz", "has_sig": false, "md5_digest": "e3fb21bbf864f864ec8f387203114090", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15589, "upload_time": "2019-08-16T07:16:09", "url": 
"https://files.pythonhosted.org/packages/41/41/6595d07297195c39ebe8fa6f0d54ea0657e577ecaf799963341841475652/CrawlerDetect-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "9751074e4f93f05c1c6a41d7d0a56664", "sha256": "60edd916190bf030c41cc93c83a4f5aaf6723d659a1673c82258a728b3a5934d" }, "downloads": -1, "filename": "CrawlerDetect-0.1.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "9751074e4f93f05c1c6a41d7d0a56664", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4, <4", "size": 16974, "upload_time": "2019-08-16T07:18:08", "url": "https://files.pythonhosted.org/packages/be/c3/d4a21d561473602dc0eded3d845998bb4156b84d324a876e329f8589898d/CrawlerDetect-0.1.3-py2.py3-none-any.whl" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "cc710486898fb3658f4eb21b99918752", "sha256": "81793d14796b0707539e62855d18c0eb4af83f97297ca64f64f67a788781a5a2" }, "downloads": -1, "filename": "crawlerdetect-0.1.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "cc710486898fb3658f4eb21b99918752", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4, <4", "size": 17019, "upload_time": "2019-08-16T07:46:31", "url": "https://files.pythonhosted.org/packages/5d/18/20e696aafb1a685fc5b8f9217e64bc615c73afaec4d2291159e707c62e14/crawlerdetect-0.1.4-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cc710486898fb3658f4eb21b99918752", "sha256": "81793d14796b0707539e62855d18c0eb4af83f97297ca64f64f67a788781a5a2" }, "downloads": -1, "filename": "crawlerdetect-0.1.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "cc710486898fb3658f4eb21b99918752", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4, <4", "size": 17019, "upload_time": "2019-08-16T07:46:31", "url": "https://files.pythonhosted.org/packages/5d/18/20e696aafb1a685fc5b8f9217e64bc615c73afaec4d2291159e707c62e14/crawlerdetect-0.1.4-py2.py3-none-any.whl" } ] }