{ "info": { "author": "Povilas Balciunas", "author_email": "balciunas90@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "=====\nAbout\n=====\n\n.. image:: https://travis-ci.org/povilasb/scrapy-html-storage.svg?branch=master\n.. image:: https://coveralls.io/repos/github/povilasb/scrapy-html-storage/badge.svg?branch=master :target: https://coveralls.io/github/povilasb/scrapy-html-storage?branch=master\n\nThis is Scrapy downloader middleware that stores response HTMLs to disk.\n\nUsage\n=====\n\nTurn downloader on, e.g. specifying it in `settings.py`::\n\n DOWNLOADER_MIDDLEWARES = {\n 'scrapy_html_storage.HtmlStorageMiddleware': 10,\n }\n\nNone of responses by default are saved to disk.\nYou must select for which requests the response HTMLs will be saved::\n\n def parse(self, response):\n \"\"\"Processes start urls.\n\n Args:\n response (HtmlResponse): scrapy HTML response object.\n \"\"\"\n yield scrapy.Request(\n 'http://target.com',\n callback=self.parse_target,\n meta={\n 'save_html': True,\n }\n )\n\nThe file path where HTML will be stored is resolved with spider method\n`response_html_path`. E.g.::\n\n class TargetSpider(scrapy.Spider):\n def response_html_path(self, request):\n \"\"\"\n Args:\n request (scrapy.http.request.Request): request that produced the\n response.\n \"\"\"\n return 'html/last_response.html'\n\nConfiguration\n=============\n\nHTML storage downloader middleware supports such options:\n\n* **gzip_output** (bool) - if True, HTML output will be stored in gzip format.\n Default is False.\n* **save_html_on_status** (list) - if not empty, sets list of response codes\n whitelisted for html saving. If list is empty or not provided, all response\n codes will be allowed for html saving.\n\nSample::\n\n HTML_STORAGE = {\n 'gzip_output': True,\n 'save_html_on_status': [200, 202]\n }", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/povilasb/scrapy-html-storage", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "scrapy-html-storage", "package_url": "https://pypi.org/project/scrapy-html-storage/", "platform": "", "project_url": "https://pypi.org/project/scrapy-html-storage/", "project_urls": { "Homepage": "https://github.com/povilasb/scrapy-html-storage" }, "release_url": "https://pypi.org/project/scrapy-html-storage/0.4.0/", "requires_dist": null, "requires_python": "", "summary": "Scrapy downloader middleware that stores response HTML files to disk.", "version": "0.4.0" }, "last_serial": 3996884, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "2e1df27ee6fa7804414ee6fee2114ae4", "sha256": "7cfb91dbe908cacfef823805461ee2d454f59e2265c266abfbde2423f63a4a57" }, "downloads": -1, "filename": "scrapy-html-storage-0.1.0.tar.gz", "has_sig": false, "md5_digest": "2e1df27ee6fa7804414ee6fee2114ae4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2178, "upload_time": "2016-03-29T14:43:35", "url": "https://files.pythonhosted.org/packages/5e/f5/23662638faabbc1efd9cdaa4b66abfc2df10589beec5cf96cb0cfe34cf38/scrapy-html-storage-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "6fe36efe62531da9a9191ba64ccddc6b", "sha256": "5d40667141685669a965dfff6dbe640f6c6f1e2cf5319ce8ddf0c39a5d3536a5" }, "downloads": -1, "filename": "scrapy-html-storage-0.2.0.tar.gz", "has_sig": false, "md5_digest": "6fe36efe62531da9a9191ba64ccddc6b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2601, "upload_time": "2016-04-19T11:22:48", "url": "https://files.pythonhosted.org/packages/65/16/962e62c2a5e4904e7ca1819f8a1ec6fc949fa5cbfd0fc12f72213ae0daf9/scrapy-html-storage-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "26065478238ab8d2505255c8946dd2a7", "sha256": "72174c5a606b49992aca5cecebc9ebb16abbe9ea63fed5b1b689825aa2fc2208" }, "downloads": -1, "filename": "scrapy-html-storage-0.3.0.tar.gz", "has_sig": false, "md5_digest": "26065478238ab8d2505255c8946dd2a7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2927, "upload_time": "2016-11-11T14:47:33", "url": "https://files.pythonhosted.org/packages/66/34/453862cb43500cf1efcb81933ed472e714eef236d8ff30b1679a4c1daf0f/scrapy-html-storage-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "81fff2eec7d59dcc8770ca857e0ed40a", "sha256": "b7b55bb5025efe8c545b4d9fdcf357bb9a4bbaa17265f68ec093615dd5672fdc" }, "downloads": -1, "filename": "scrapy-html-storage-0.4.0.tar.gz", "has_sig": false, "md5_digest": "81fff2eec7d59dcc8770ca857e0ed40a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3000, "upload_time": "2018-06-24T14:10:46", "url": "https://files.pythonhosted.org/packages/78/15/f7a99fbfa63298323b8288695f34c21095410d4739a5368fd0b0bb0f7f1c/scrapy-html-storage-0.4.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "81fff2eec7d59dcc8770ca857e0ed40a", "sha256": "b7b55bb5025efe8c545b4d9fdcf357bb9a4bbaa17265f68ec093615dd5672fdc" }, "downloads": -1, "filename": "scrapy-html-storage-0.4.0.tar.gz", "has_sig": false, "md5_digest": "81fff2eec7d59dcc8770ca857e0ed40a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3000, "upload_time": "2018-06-24T14:10:46", "url": "https://files.pythonhosted.org/packages/78/15/f7a99fbfa63298323b8288695f34c21095410d4739a5368fd0b0bb0f7f1c/scrapy-html-storage-0.4.0.tar.gz" } ] }