{ "info": { "author": "Alex Wang", "author_email": "", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Environment :: Plugins", "Framework :: Scrapy", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Internet :: WWW/HTTP", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "====================\nScrapy-Rotated-Proxy\n====================\n\n.. image:: https://img.shields.io/pypi/v/scrapy-rotated-proxy.svg\n :target: https://pypi.python.org/pypi/scrapy-rotated-proxy\n :alt: PyPI Version\n\n.. image:: https://img.shields.io/travis/xiaowangwindow/scrapy-rotated-proxy/master.svg\n :target: http://travis-ci.org/xiaowangwindow/scrapy-rotated-proxy\n :alt: Build Status\n\nOverview\n########\n\nScrapy-Rotated-Proxy is a Scrapy downloadmiddleware to dynamically attach proxy to Request,\nwhich can repeately use rotated proxies supplied by configuration.\nIt can temporarily block unavailable proxy ip\nand retrieve to use in the future when the proxy is available.\nAlso, it can remove invalid proxy ip through Scrapy signal.\nProxy ip list can be supplied through Spider Settings, File or MongoDB.\n\nRequirements\n############\n\n* Python 2.7 or Python 3.3+\n* Works on Linux, Windows, Mac OSX, BSD\n\nInstall\n########\n\nThe quick way::\n\n pip install scrapy-rotated-proxy\n\nOR copy this middleware to your scrapy project.\n\nConfiguration\n#############\n\nBasic Configuration\n===================\n\nEnable with Spider Settings\n---------------------------\n\nenable scrapy-rotated-proxy middleware and supply proxy ip list through Spider Settings\n\n::\n\n # -----------------------------------------------------------------------------\n # ROTATED PROXY SETTINGS (Spider Settings Backend)\n # -----------------------------------------------------------------------------\n DOWNLOADER_MIDDLEWARES.update({\n 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,\n 'scrapy_rotated_proxy.downloadmiddlewares.proxy.RotatedProxyMiddleware': 750,\n })\n ROTATED_PROXY_ENABLED = True\n PROXY_STORAGE = 'scrapy_rotated_proxy.extensions.file_storage.FileProxyStorage'\n # When set PROXY_FILE_PATH='', scrapy-rotated-proxy\n # will use proxy in Spider Settings default.\n PROXY_FILE_PATH = ''\n HTTP_PROXIES = [\n 'http://proxy0:8888',\n 'http://user:pass@proxy1:8888',\n 'https://user:pass@proxy1:8888',\n ]\n HTTPS_PROXIES = [\n 'http://proxy0:8888',\n 'http://user:pass@proxy1:8888',\n 'https://user:pass@proxy1:8888',\n ]\n\nEnable with Local File\n-----------------------\n\nenable scrapy-rotated-proxy middleware and supply proxy ip list through Local File\n\n::\n\n # -----------------------------------------------------------------------------\n # ROTATED PROXY SETTINGS (Local File Backend)\n # -----------------------------------------------------------------------------\n DOWNLOADER_MIDDLEWARES.update({\n 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,\n 'scrapy_rotated_proxy.downloadmiddlewares.proxy.RotatedProxyMiddleware': 750,\n })\n ROTATED_PROXY_ENABLED = True\n PROXY_STORAGE = 'scrapy_rotated_proxy.extensions.file_storage.FileProxyStorage'\n PROXY_FILE_PATH = 'file_path/proxy.txt'\n\nlocal file store proxy list with json style\n\n::\n\n # proxy file content, must conform to json format, otherwise will cause json\n # load error\n HTTP_PROXIES = [\n 'http://proxy0:8888',\n 'http://user:pass@proxy1:8888',\n 'https://user:pass@proxy1:8888'\n ]\n HTTPS_PROXIES = [\n 'http://proxy0:8888',\n 'http://user:pass@proxy1:8888',\n 'https://user:pass@proxy1:8888'\n ]\n\nEnable with MongoDB\n-------------------\n\nenable `scrapy-rotated-proxy` middleware and supply proxy ip list through MongoDB\n\n::\n\n # -----------------------------------------------------------------------------\n # ROTATED PROXY SETTINGS (MongoDB Backend)\n # -----------------------------------------------------------------------------\n # mongodb document required field: scheme, ip, port, username, password\n # document example: {'scheme': 'http', 'ip': '10.0.0.1', 'port': 8080,\n # 'username':'user', 'password':'password'}\n DOWNLOADER_MIDDLEWARES.update({\n 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,\n 'scrapy_rotated_proxy.downloadmiddlewares.proxy.RotatedProxyMiddleware': 750,\n })\n ROTATED_PROXY_ENABLED = True\n PROXY_STORAGE = 'scrapy_rotated_proxy.extensions.mongodb_storage.MongoDBProxyStorage'\n PROXY_MONGODB_HOST = HOST_OR_IP\n PROXY_MONGODB_PORT = 27017\n PROXY_MONGODB_USERNAME = USERNAME_OR_NONE\n PROXY_MONGODB_PASSWORD = PASSWORD_OR_NONE\n PROXY_MONGODB_AUTH_DB = 'admin'\n PROXY_MONGODB_DB = 'vps_management'\n PROXY_MONGODB_COLL = 'service'\n\nAdvanced Configuration\n======================\nBlock Settings\n--------------\n\nDefault, spider will close when run out of all proxies. you can config spider to\nwait until block proxies become valid, which you block by signal\n\n::\n\n # -----------------------------------------------------------------------------\n # OTHER SETTINGS (Optional)\n # -----------------------------------------------------------------------------\n PROXY_SLEEP_INTERVAL = 60*60*24 # Default 24hours\n PROXY_SPIDER_CLOSE_WHEN_NO_PROXY = False # Default True\n\nSignals\n-------\n\nRemove proxy that never be used in the spider, you can send signal to\n**scrapy_rotated_proxy.signals.proxy_remove**, which signal must contains arguments\nincluding ``spider``, ``request``, ``exception``\n\nBlock proxy that can be used in the future after sleep interval reach, you can send signal to\n**scrapy_rotated_proxy.signals.proxy_block**, which signal must contains arguments\nincluding ``spider``, ``response``, ``exception``\n\nSettings Reference\n###################\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| Setting | Description | Default |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| ROTATED_PROXY_ENABLED | Whether to enable Scrapy-Rotated-Proxy | True |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_STORAGE | A class which implements the proxy storage backend | FileProxyStorage |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MONGODB_HOST | MongoDB host for MongoDB backend | '127.0.0.1' |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MONGODB_PORT | MongoDB port for MongoDB backend | 27017 |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MONGODB_USERNAME | MongoDB username for MongoDB backend | None |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MOGNODB_PASSWORD | MongoDB password for MongoDB backend | None |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MONGODB_DB | MongoDB database name for MongoDB backend | proxy_management |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MONGODB_COLL | MongoDB collection name for MongoDB backend | proxy |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_MONGODB_OPTIONS_* | MongoDB uri options for MongoDB backend | |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_FILE_PATH | Path of file that store proxies. default is None, means get proxies from Spider Settings | None |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| HTTP_PROXIES | keywords of HTTP proxies for LocalFile backend or Spider Settings | |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| HTTPS_PROXIES | keywords of HTTPS proxies for LocalFile backend or Spider Settings | |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_SLEEP_INTERVAL | Time to sleep for blocked proxy become available | 60*60*24 |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_SPIDER_CLOSE_WHEN_NO_PROXY | Whether to close spider when run out of all proxies | True |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+\n| PROXY_RELOAD_ENABLED | enable to reload proxy from storage when all proxies was used and prepare to cycle use | False |\n+----------------------------------+------------------------------------------------------------------------------------------+------------------+", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/xiaowangwindow/scrapy-rotated-proxy", "keywords": "", "license": "BSD", "maintainer": "Alex Wang", "maintainer_email": "xiaowangwindow@163.com", "name": "scrapy-rotated-proxy", "package_url": "https://pypi.org/project/scrapy-rotated-proxy/", "platform": "", "project_url": "https://pypi.org/project/scrapy-rotated-proxy/", "project_urls": { "Homepage": "https://github.com/xiaowangwindow/scrapy-rotated-proxy" }, "release_url": "https://pypi.org/project/scrapy-rotated-proxy/0.1.5/", "requires_dist": null, "requires_python": "", "summary": "A middleware to change proxy rotated for Scrapy", "version": "0.1.5" }, "last_serial": 4198425, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "124b42030f7092ebdae0cf23457ce00a", "sha256": "f4ffc818b2cfa03b7cbffd931398e022de367115a8a82d46a35dac5f4ce2eeef" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.2.tar.gz", "has_sig": false, "md5_digest": "124b42030f7092ebdae0cf23457ce00a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10095, "upload_time": "2017-10-12T06:44:38", "url": "https://files.pythonhosted.org/packages/71/d8/52c3e160880aad3e47f33c42341eebcb5dc2e7d1ec3eb424306afc9bae42/scrapy-rotated-proxy-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "99c3e97a5f2abd5e4c42c5bc59673c43", "sha256": "9ab7b183d200293fd3c335b94ac9e0e4335da8dea9994b63023fbde5001f2417" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.3.tar.gz", "has_sig": false, "md5_digest": "99c3e97a5f2abd5e4c42c5bc59673c43", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9832, "upload_time": "2017-10-12T10:22:26", "url": "https://files.pythonhosted.org/packages/f6/72/118e8f8681ab953b9ed63b27e9ea50c88deea09bb6cf713fe18ea94cf582/scrapy-rotated-proxy-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "463be94604e3067fbe7373d766a21809", "sha256": "6fddf50f00c97c7a9952a986b32a8cb8dbd36769542bf6ff6fde6af1b0188101" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.4.tar.gz", "has_sig": false, "md5_digest": "463be94604e3067fbe7373d766a21809", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10008, "upload_time": "2017-10-13T05:05:42", "url": "https://files.pythonhosted.org/packages/41/f3/0a9946156d56fe381a3a5736c66ee11a368407810061a1f4c287e4f47921/scrapy-rotated-proxy-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "20d2ceb3e634f002db708ceb29377737", "sha256": "843d1ed58fd4e28120062661c746bf09a89e29eb0d7bde54c9cdb9044c8a2830" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.5.tar.gz", "has_sig": false, "md5_digest": "20d2ceb3e634f002db708ceb29377737", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10056, "upload_time": "2017-10-17T07:13:05", "url": "https://files.pythonhosted.org/packages/19/71/ffe8d38e7d7c7d4af701b3cd908d5fe6e71f64c22783bcaad79dd71a8ee9/scrapy-rotated-proxy-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "e2614ba474733bdd87dce42b502b1420", "sha256": "522712f6dab1ee1bb3334cc633039b5f82b1d512e811e0a97c2370bd68ec8c8d" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.6.tar.gz", "has_sig": false, "md5_digest": "e2614ba474733bdd87dce42b502b1420", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9410, "upload_time": "2017-10-19T01:54:18", "url": "https://files.pythonhosted.org/packages/97/6f/3fbfcd0d592d9157f2a38b95e169fd34aa5145b4efce0800ea5a95a242a8/scrapy-rotated-proxy-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "3f18cdd30e70201608a24bd796916766", "sha256": "a80d3dbd071ccc75286f9bfa8568ec0459b2b9df6a7c1599657a3431ac5b8f79" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.7.tar.gz", "has_sig": false, "md5_digest": "3f18cdd30e70201608a24bd796916766", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10207, "upload_time": "2017-10-27T01:29:22", "url": "https://files.pythonhosted.org/packages/18/35/395e301da5314b72fc6352e1f7ff3512cf03fb85a71d9c5e67108a242017/scrapy-rotated-proxy-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "e8d8c9e6624213333fc912a93f830784", "sha256": "ffc44f88e2d30e6301ffde761930bdcf34abf468b1f6a090c423165ba3fa942c" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.8.tar.gz", "has_sig": false, "md5_digest": "e8d8c9e6624213333fc912a93f830784", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10230, "upload_time": "2017-10-30T06:24:28", "url": "https://files.pythonhosted.org/packages/5d/1e/ace56eaaa880090ffcfe2fc1c44fae0b6340ed55d3c6f4401d32835fc38c/scrapy-rotated-proxy-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "73a612bc4334f7f7a0074361e0360ec6", "sha256": "a2e52969068192af8610eb7a3add98d3a6eac9c2073a1933395658d3ce4860e1" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.0.9.tar.gz", "has_sig": false, "md5_digest": "73a612bc4334f7f7a0074361e0360ec6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10503, "upload_time": "2017-11-01T02:33:26", "url": "https://files.pythonhosted.org/packages/4b/86/80fc76ccbde5ff8b7f76ab96fd7d82e94acc22a5a6f3578da15ce5256df2/scrapy-rotated-proxy-0.0.9.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "f1876129b1e495c903d0089063432b39", "sha256": "bd9013d3d58d777d2efb4e4685ba382f06abb985f96c708e3bf0de24ee24683f" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.1.0.tar.gz", "has_sig": false, "md5_digest": "f1876129b1e495c903d0089063432b39", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12727, "upload_time": "2017-11-14T03:30:13", "url": "https://files.pythonhosted.org/packages/5b/53/c62bf3ec22dc47dbb986a3fd2f1135cf6fcadf5b143f227bc10dd326aec6/scrapy-rotated-proxy-0.1.0.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "47db635790c51cfd5295a654eb88dcf1", "sha256": "62c907424bc136b488b1bb8889f2b7c0e8564d4c46fdbb241c668f50cb07eba1" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.1.2.tar.gz", "has_sig": false, "md5_digest": "47db635790c51cfd5295a654eb88dcf1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12942, "upload_time": "2018-08-22T06:25:55", "url": "https://files.pythonhosted.org/packages/1c/13/595c3576830986d0ba5b4a719022533457abde6fb0e1f11b4bd0e682ca00/scrapy-rotated-proxy-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "064bb808ff176afb856d1952a673e933", "sha256": "da528208b712fcfce81a0b7d172cdef4eb96d9620de94dbd5557f8663cd5d017" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.1.3.tar.gz", "has_sig": false, "md5_digest": "064bb808ff176afb856d1952a673e933", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12947, "upload_time": "2018-08-22T07:55:52", "url": "https://files.pythonhosted.org/packages/3e/2d/d7f6b35b6c9dcb9c10592467877139f2ae78b677c92a0a83d7c3614c2e15/scrapy-rotated-proxy-0.1.3.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "e60af00ef9c6e7eed1e3c3f710f46f45", "sha256": "e7e40a9836da8268b8247a99af390d523f9158927cd1063493d1a2987f385e3a" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.1.5.tar.gz", "has_sig": false, "md5_digest": "e60af00ef9c6e7eed1e3c3f710f46f45", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12947, "upload_time": "2018-08-23T04:18:38", "url": "https://files.pythonhosted.org/packages/e6/74/b30b4021909354613a5ef59c01b8547302e9daf9ddf5f272945a3313b41e/scrapy-rotated-proxy-0.1.5.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e60af00ef9c6e7eed1e3c3f710f46f45", "sha256": "e7e40a9836da8268b8247a99af390d523f9158927cd1063493d1a2987f385e3a" }, "downloads": -1, "filename": "scrapy-rotated-proxy-0.1.5.tar.gz", "has_sig": false, "md5_digest": "e60af00ef9c6e7eed1e3c3f710f46f45", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12947, "upload_time": "2018-08-23T04:18:38", "url": "https://files.pythonhosted.org/packages/e6/74/b30b4021909354613a5ef59c01b8547302e9daf9ddf5f272945a3313b41e/scrapy-rotated-proxy-0.1.5.tar.gz" } ] }