{ "info": { "author": "Balthazar Rouberol", "author_email": "balthazar@mapado.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: No Input/Output (Daemon)", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python" ], "description": "``scrapy-redirect`` restricts authorized HTTP redirections to the website start_urls\n\nWhy?\n----\n\nIf the `Scrapy `_ ``REDIRECT_ENABLED`` config key is set to ``False`` and a request to the homepage of the crawled website returns a `3XX status code `_, the crawl will stop immediately, as the redirection will not be followed.\n\n``scrapy-redirect`` will force Scrapy to tolerate redirections coming from the ``start_urls`` URLs, in the case where ``REDIRECT_ENABLED = False``, to avoid this particular problem.\n\nInstallation\n------------\n\n.. code-block:: bash\n\n $ pip install scrapy-redirect\n\n\nConfiguration\n--------------\n\nInstall ``scrapy-redirect`` in your Scrapy middlewares by adding the following key/value pair in the ``SPIDER_MIDDLEWARES`` settings key (in ``settings.py``):\n\n.. 
code-block:: python\n\n SPIDER_MIDDLEWARES = {\n ...\n 'scrapyredirect.HomepageRedirectMiddleware': 575,\n ...\n }\n\nNote that it is important for the middleware order value to be lower than 600 (the `default value `_ of the ``'scrapy.contrib.downloadermiddleware.redirect.RedirectMiddleware'`` middleware), as it must be executed before Scrapy blocks the redirection.\n\nNB: if ``REDIRECT_ENABLED = True``, ``scrapy-redirect`` does nothing.\n\nLicense\n-------\n\n``scrapy-redirect`` is published under the MIT License.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/mapado/scrapy-redirect", "keywords": "scrapy crawl scraping", "license": "The MIT License (MIT)", "maintainer": null, "maintainer_email": null, "name": "scrapy-redirect", "package_url": "https://pypi.org/project/scrapy-redirect/", "platform": "Any", "project_url": "https://pypi.org/project/scrapy-redirect/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://github.com/mapado/scrapy-redirect" }, "release_url": "https://pypi.org/project/scrapy-redirect/0.1.0/", "requires_dist": null, "requires_python": null, "summary": "Restrict authorized Scrapy redirections to the website start_urls", "version": "0.1.0" }, "last_serial": 856507, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "baf105f87f301ed9c53881c2690f1a07", "sha256": "4eb733d867925794b0598738da518374d35199db5af4a04f62235352db67b2d2" }, "downloads": -1, "filename": "scrapy-redirect-0.1.0.tar.gz", "has_sig": false, "md5_digest": "baf105f87f301ed9c53881c2690f1a07", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3250, "upload_time": "2013-09-04T09:43:39", "url": "https://files.pythonhosted.org/packages/d0/37/1b6b6ee64a0fbf37eb408c298dfddb04b22f5b2f989d8a6b948d32c70780/scrapy-redirect-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": 
"baf105f87f301ed9c53881c2690f1a07", "sha256": "4eb733d867925794b0598738da518374d35199db5af4a04f62235352db67b2d2" }, "downloads": -1, "filename": "scrapy-redirect-0.1.0.tar.gz", "has_sig": false, "md5_digest": "baf105f87f301ed9c53881c2690f1a07", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3250, "upload_time": "2013-09-04T09:43:39", "url": "https://files.pythonhosted.org/packages/d0/37/1b6b6ee64a0fbf37eb408c298dfddb04b22f5b2f989d8a6b948d32c70780/scrapy-redirect-0.1.0.tar.gz" } ] }