{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "# Scrapy with selenium\n[![PyPI](https://img.shields.io/pypi/v/scrapy-selenium.svg)](https://pypi.python.org/pypi/scrapy-selenium) [![Build Status](https://travis-ci.org/clemfromspace/scrapy-selenium.svg?branch=master)](https://travis-ci.org/clemfromspace/scrapy-selenium) [![Test Coverage](https://api.codeclimate.com/v1/badges/5c737098dc38a835ff96/test_coverage)](https://codeclimate.com/github/clemfromspace/scrapy-selenium/test_coverage) [![Maintainability](https://api.codeclimate.com/v1/badges/5c737098dc38a835ff96/maintainability)](https://codeclimate.com/github/clemfromspace/scrapy-selenium/maintainability)\n\nScrapy middleware to handle javascript pages using selenium.\n\n## Installation\n```\n$ pip install scrapy-selenium\n```\nYou should use **python>=3.6**. \nYou will also need one of the Selenium [compatible browsers](http://www.seleniumhq.org/about/platforms.jsp).\n\n## Configuration\n1. Add the browser to use, the path to the driver executable, and the arguments to pass to the executable to the scrapy settings:\n ```python\n from shutil import which\n\n SELENIUM_DRIVER_NAME = 'firefox'\n SELENIUM_DRIVER_EXECUTABLE_PATH = which('geckodriver')\n SELENIUM_DRIVER_ARGUMENTS=['-headless'] # '--headless' if using chrome instead of firefox\n ```\n\nOptionally, set the path to the browser executable:\n ```python\n SELENIUM_BROWSER_EXECUTABLE_PATH = which('firefox')\n ```\n\n2. Add the `SeleniumMiddleware` to the downloader middlewares:\n ```python\n DOWNLOADER_MIDDLEWARES = {\n 'scrapy_selenium.SeleniumMiddleware': 800\n }\n ```\n## Usage\nUse the `scrapy_selenium.SeleniumRequest` instead of the scrapy built-in `Request` like below:\n```python\nfrom scrapy_selenium import SeleniumRequest\n\nyield SeleniumRequest(url, self.parse_result)\n```\nThe request will be handled by selenium, and the request will have an additional `meta` key, named `driver` containing the selenium driver with the request processed.\n```python\ndef parse_result(self, response):\n print(response.request.meta['driver'].title)\n```\nFor more information about the available driver methods and attributes, refer to the [selenium python documentation](http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.remote.webdriver)\n\nThe `selector` response attribute work as usual (but contains the html processed by the selenium driver).\n```python\ndef parse_result(self, response):\n print(response.selector.xpath('//title/@text'))\n```\n\n### Additional arguments\nThe `scrapy_selenium.SeleniumRequest` accept 4 additional arguments:\n\n#### `wait_time` / `wait_until`\n\nWhen used, selenium will perform an [Explicit wait](http://selenium-python.readthedocs.io/waits.html#explicit-waits) before returning the response to the spider.\n```python\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.support import expected_conditions as EC\n\nyield SeleniumRequest(\n url=url,\n callback=self.parse_result,\n wait_time=10,\n wait_until=EC.element_to_be_clickable((By.ID, 'someid'))\n)\n```\n\n#### `screenshot`\nWhen used, selenium will take a screenshot of the page and the binary data of the .png captured will be added to the response `meta`:\n```python\nyield SeleniumRequest(\n url=url,\n callback=self.parse_result,\n screenshot=True\n)\n\ndef parse_result(self, response):\n with open('image.png', 'wb') as image_file:\n image_file.write(response.meta['screenshot'])\n```\n\n#### `script`\nWhen used, selenium will execute custom JavaScript code.\n```python\nyield SeleniumRequest(\n url,\n self.parse_result,\n script='window.scrollTo(0, document.body.scrollHeight);',\n)\n```\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/clemfromspace/scrapy-selenium", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "scrapy-selenium", "package_url": "https://pypi.org/project/scrapy-selenium/", "platform": "", "project_url": "https://pypi.org/project/scrapy-selenium/", "project_urls": { "Homepage": "https://github.com/clemfromspace/scrapy-selenium" }, "release_url": "https://pypi.org/project/scrapy-selenium/0.0.7/", "requires_dist": [ "scrapy (>=1.0.0)", "selenium (>=3.9.0)" ], "requires_python": "", "summary": "Scrapy with selenium", "version": "0.0.7" }, "last_serial": 4737430, "releases": { "0.0.1.dev1": [ { "comment_text": "", "digests": { "md5": "e6533376d911ba204feabafc7dbf46ab", "sha256": "bee21a285941a09ece825b345e15a382065f5142fb4a065ef577c4215610b14f" }, "downloads": -1, "filename": "scrapy_selenium-0.0.1.dev1-py3-none-any.whl", "has_sig": false, "md5_digest": "e6533376d911ba204feabafc7dbf46ab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8533, "upload_time": "2018-02-11T11:04:12", "url": "https://files.pythonhosted.org/packages/11/2d/84e8c3854326b653b71a0b8f92498fa8f0db9dbf372b7d8c06919ed96337/scrapy_selenium-0.0.1.dev1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "19ea1d1c511a99a411b8047c992bb39f", "sha256": "8976382abd1962bab91e00bbcc4f53a4ae2d8a7d82bf08a5b2c6a57e4b1713fd" }, "downloads": -1, "filename": "scrapy-selenium-0.0.1.dev1.tar.gz", "has_sig": false, "md5_digest": "19ea1d1c511a99a411b8047c992bb39f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5115, "upload_time": "2018-02-11T11:04:13", "url": "https://files.pythonhosted.org/packages/7c/32/b0d57c57d947410598b15189b47e41a86990c5ee20c052f7e9ba52438c1e/scrapy-selenium-0.0.1.dev1.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "bbcc2dd760dc1c3e94ae13e0fbfe287b", "sha256": "0ccc29d14cc3c4fc1d3070ce87cfa5ab7e807e326369de26ef0e405194d2e9c0" }, "downloads": -1, "filename": "scrapy_selenium-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "bbcc2dd760dc1c3e94ae13e0fbfe287b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8461, "upload_time": "2018-02-11T12:07:06", "url": "https://files.pythonhosted.org/packages/a7/02/420553a100d2978ff81c67a964ad6e97b4d60eedf487190a6b47640f60f1/scrapy_selenium-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a07f9af81ae07181f4104cc3e188bd6", "sha256": "2bcee4e51fcbe09e0f4187e9217783aa9fb3ab1700d498c17d9ec4b67fcbbb60" }, "downloads": -1, "filename": "scrapy-selenium-0.0.3.tar.gz", "has_sig": false, "md5_digest": "1a07f9af81ae07181f4104cc3e188bd6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5100, "upload_time": "2018-02-11T12:07:08", "url": "https://files.pythonhosted.org/packages/af/05/99ededa143dc4e55cb31dc526d341ec38982eba2de93f19bef290b9b3779/scrapy-selenium-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "bb0dad155b8cdc4f41952d701b13f356", "sha256": "e748a6ce8964de6bd1a0ed90434f0465479a8f6ce0f15cc6db98eff1dc249029" }, "downloads": -1, "filename": "scrapy_selenium-0.0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "bb0dad155b8cdc4f41952d701b13f356", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6410, "upload_time": "2018-09-06T07:52:21", "url": "https://files.pythonhosted.org/packages/58/77/1970935dfa6ea3a139709faa1d5ba8c6faeb8e3a6d76aa7525e4fc7111ff/scrapy_selenium-0.0.4-py3-none-any.whl" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "bcaeb57034231f5db962c29bf30ba4fe", "sha256": "e6235df80327b4f934bf9bce7f4a3fd2b48db8ce6b9af2f6ba3cd29eaa2093ba" }, "downloads": -1, "filename": "scrapy_selenium-0.0.5-py3-none-any.whl", "has_sig": false, "md5_digest": "bcaeb57034231f5db962c29bf30ba4fe", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6417, "upload_time": "2018-09-08T19:59:05", "url": "https://files.pythonhosted.org/packages/db/a1/bed5f1b668afe43fd51a545360ad0ac27580da13538fbdf9e8a03d831e9d/scrapy_selenium-0.0.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e8cefbb8bbdf4f951e8f79bb40740afb", "sha256": "f20ee3a2ebbedf43ee8fd5d0b441e6ba9c2aeecdd4186f20216f5eae9f90df3b" }, "downloads": -1, "filename": "scrapy-selenium-0.0.5.tar.gz", "has_sig": false, "md5_digest": "e8cefbb8bbdf4f951e8f79bb40740afb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5146, "upload_time": "2018-09-08T19:59:06", "url": "https://files.pythonhosted.org/packages/63/af/a6a660a3afed9c9e6a73c99034f4af3f84f45a98d0a1175bd0892b53f19e/scrapy-selenium-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "b1ce3b4787b4013f5c6d27786112bab4", "sha256": "a580ded10442699e40049c51ed8779a564ce350d8036279cd71140c7a0bb15af" }, "downloads": -1, "filename": "scrapy_selenium-0.0.6-py2-none-any.whl", "has_sig": false, "md5_digest": "b1ce3b4787b4013f5c6d27786112bab4", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 7018, "upload_time": "2019-01-03T20:54:48", "url": "https://files.pythonhosted.org/packages/30/6f/9672c5d1bed22e4d9a5d75f9b9d862b2c10ca9b0021f6e701ac927a2dbb1/scrapy_selenium-0.0.6-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f50156943eb25cdcea30d494872be9a3", "sha256": "38163f5a0077e47a98ee541b8a6d836997c46cf89cfeae807916e27fcf0aba36" }, "downloads": -1, "filename": "scrapy_selenium-0.0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "f50156943eb25cdcea30d494872be9a3", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6526, "upload_time": "2019-01-03T20:54:50", "url": "https://files.pythonhosted.org/packages/68/51/d6ad60854ebd5f21fac65c856aed32143731f1a81d02e51bdcd1bd0b7e62/scrapy_selenium-0.0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "62afa40e6dc80ab175be45a054e426b3", "sha256": "96a719664df3b11e5ae44abbe8a2cd8904918c9994e6006ad7eb38c6b78962b1" }, "downloads": -1, "filename": "scrapy-selenium-0.0.6.tar.gz", "has_sig": false, "md5_digest": "62afa40e6dc80ab175be45a054e426b3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5254, "upload_time": "2019-01-03T20:54:51", "url": "https://files.pythonhosted.org/packages/4d/c7/42ac3d09e853f834a7c037ed12f34de39f86d0264fb7501d9ada32e44da1/scrapy-selenium-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "c12b3a563424e29915b78c5a3addabf2", "sha256": "70766315c7970b12a142e1b7a9f43ffb3ef1260891811062ec9dd46a665d935a" }, "downloads": -1, "filename": "scrapy_selenium-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "c12b3a563424e29915b78c5a3addabf2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6723, "upload_time": "2019-01-24T21:40:17", "url": "https://files.pythonhosted.org/packages/2d/8f/066607f29d4b351c9dbb10d86f580d2d2dde2f24f7c96427dc681b14e741/scrapy_selenium-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e9872171640c5bf72e73defc2f29d0f6", "sha256": "51f809802a1f62ed852cfe2d2ed49f6141058cc5254ed4b448d2ffe6f7a1b6e9" }, "downloads": -1, "filename": "scrapy-selenium-0.0.7.tar.gz", "has_sig": false, "md5_digest": "e9872171640c5bf72e73defc2f29d0f6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5505, "upload_time": "2019-01-24T21:40:19", "url": "https://files.pythonhosted.org/packages/6e/36/b14b771d9238c054cc691c390c0d2c037436a3f3cbcb6de26c1be2ca8e2c/scrapy-selenium-0.0.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c12b3a563424e29915b78c5a3addabf2", "sha256": "70766315c7970b12a142e1b7a9f43ffb3ef1260891811062ec9dd46a665d935a" }, "downloads": -1, "filename": "scrapy_selenium-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "c12b3a563424e29915b78c5a3addabf2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6723, "upload_time": "2019-01-24T21:40:17", "url": "https://files.pythonhosted.org/packages/2d/8f/066607f29d4b351c9dbb10d86f580d2d2dde2f24f7c96427dc681b14e741/scrapy_selenium-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e9872171640c5bf72e73defc2f29d0f6", "sha256": "51f809802a1f62ed852cfe2d2ed49f6141058cc5254ed4b448d2ffe6f7a1b6e9" }, "downloads": -1, "filename": "scrapy-selenium-0.0.7.tar.gz", "has_sig": false, "md5_digest": "e9872171640c5bf72e73defc2f29d0f6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5505, "upload_time": "2019-01-24T21:40:19", "url": "https://files.pythonhosted.org/packages/6e/36/b14b771d9238c054cc691c390c0d2c037436a3f3cbcb6de26c1be2ca8e2c/scrapy-selenium-0.0.7.tar.gz" } ] }