{
    "info": {
        "author": "Nicholas Brochu",
        "author_email": "nicholas@serpent.ai",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 5 - Production/Stable",
            "Intended Audience :: Developers",
            "License :: OSI Approved :: Apache Software License",
            "Natural Language :: English",
            "Programming Language :: Python",
            "Programming Language :: Python :: 2.7",
            "Programming Language :: Python :: 3",
            "Programming Language :: Python :: 3.2",
            "Programming Language :: Python :: 3.3",
            "Programming Language :: Python :: 3.4",
            "Programming Language :: Python :: 3.5",
            "Programming Language :: Python :: 3.6"
        ],
        "description": "selenium-respectful\n===================\n\n`Selenium <http://www.seleniumhq.org/>`__ is already a well-known player\nwhen it comes to browser automation and is usually a part of any serious\nintegration testing stack. That being said, it is also an increasingly\npopular choice for web scraping. Now APIs are usually the ones with\ndetailed rate-limiting systems but if you are the type of person that\nunderstands that web scraping is sometimes necessary and you intend to\nbe relatively gentle about it, *selenium-respectful* might be for you!\n\n***selenium-respectful***:\n\n-  Is a minimalist wrapper for any *Selenium Webdriver* to work within\n   rate limits of any amount of services simultaneously\n-  Can scale out of a single thread, single process or even a single\n   machine\n-  Enables maximizing your allowed requests without ever going over set\n   limits and having to handle the fallout\n-  Overloads the Webdriver's *get* method and relays any other valid\n   calls\n-  Works with both Python 2 and 3 and is thoroughly tested\n-  Is a sister library to the already established\n   `requests-respectful <https://github.com/SerpentAI/requests-respectful>`__\n\n**Typical *Selenium* get call**\n\n.. code:: python\n\n    from selenium.webdriver.chrome.webdriver import WebDriver\n    driver = WebDriver()\n\n    driver.get(\"http://github.com\")\n    element = driver.find_element_by_tag_name(\"body\")\n\n**Get call with *selenium-respectful***\n\n.. code:: python\n\n    from selenium.webdriver.chrome.webdriver import WebDriver\n    from selenium_respectful import RespectfulWebdriver\n\n    driver = RespectfulWebdriver(webdriver=WebDriver())\n\n    # This can be done elsewhere but the realm needs to be registered!\n    driver.register_realm(\"Github\", max_requests=100, timespan=60)\n\n    driver.get(\"http://github.com\", realms=[\"Github\"], wait=True)\n    element = driver.find_element_by_tag_name(\"body\")  # Works as usual!\n\nRequirements\n------------\n\n-  `Redis <http://redis.io/>`__ > 2.8.0 (See FAQ if you are rolling your\n   eyes)\n\nInstallation\n------------\n\n.. code:: shell\n\n    pip install selenium-respectful\n\nConfiguration\n-------------\n\nDefault Configuration Values\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: python\n\n    {\n        \"redis\": {\n            \"host\": \"localhost\",\n            \"port\": 6379,\n            \"database\": 0\n        },\n        \"safety_threshold\": 10\n    }\n\nConfiguration Keys\n~~~~~~~~~~~~~~~~~~\n\n-  **redis**: Provides the ``host``, ``port``\\ and ``database`` of the\n   Redis instance\n-  **safety\\_threshold**: A rate-limited exception will be raised at\n   *(realm\\_max\\_requests - safety\\_threshold)*. Prevents going over the\n   limit of services in scenarios where a large amount of requests are\n   issued in parallel\n\nOverriding Configuration Values\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWith *selenium-respectful.config.yml*\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nThe library auto-detects the presence of a YAML file named\n*selenium-respectful.config.yml* at the root of your project and will\nattempt to load configuration values from it.\n\n**Example**:\n\nselenium-respectful.config.yml\n\n.. code:: yaml\n\n    redis:\n        host: 0.0.0.0\n        port: 6379\n        database: 5\n\n    safety_threshold: 25\n\n**The resulting active configuration would be:**\n\n.. code:: python\n\n    RespectfulWebdriver(webdriver=WebDriver()).config\n\n    Out[1]: {\n        \"redis\": {\n            \"host\": \"0.0.0.0\",\n            \"port\": 6379,\n            \"database\": 5\n        },\n        \"safety_threshold\": 25\n    }\n\nUsage\n-----\n\nIn your quest to use *selenium-respectful*, you should only ever have to\nbother with one class: *RespectfulWebdriver*. Instance this class and\nyou can perform all important operations.\n\nBefore each example, it is assumed that the following code has already\nbeen executed.\n\n.. code:: python\n\n    from selenium.webdriver.chrome.webdriver import WebDriver\n    from selenium_respectful import RespectfulWebdriver\n\n    driver = RespectfulWebdriver(webdriver=WebDriver())\n\nRealms\n~~~~~~\n\nRealms are simply named containers that are provided with a maximum\nrequesting rate. You are responsible of the management (i.e. CRUD) of\nyour realms.\n\nRealms track the HTTP requests that are performed under them and will\nraise a catchable rate limit exception if you are over their allowed\nrequesting rate.\n\nFetching the list of Realms\n^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.fetch_registered_realms()\n\nThis returns a list of currently registered realm names.\n\nRegistering a Realm\n^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.register_realm(\"Google\", max_requests=10, timespan=1)\n    driver.register_realm(\"Github\", max_requests=100, timespan=60)\n    driver.register_realm(\"Twitter\", max_requests=150, timespan=300)\n\n    # OR\n    realm_tuples = [\n        [\"Google\", 10, 1],\n        [\"Github\", 100, 60],\n        [\"Twitter\", 150, 300]\n    ]\n\n    driver.register_realms(realm_tuples)\n\nEither of these registers 3 realms: \\* *Google* at a maximum requesting\nrate of 10 requests per second \\* *Github* at a maximum requesting rate\nof 100 requests per minute \\* *Twitter* at a maximum requesting rate of\n150 requests per 5 minutes\n\nUpdating a Realm\n^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.update_realm(\"Google\", max_requests=25, timespan=5)\n\nThis updates the maximum requesting rate of *Google* to 25 requests per\n5 seconds.\n\nGetting the maximum requests value of a Realm\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.realm_max_requests(\"Google\")\n\nThis would return 25.\n\nGetting the timespan value of a Realm\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.realm_timespan(\"Google\")\n\nThis would return 5.\n\nUnregistering a Realm\n^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.unregister_realm(\"Google\")\n\nThis would unregister the *Google* realm, preventing further queries\nfrom executing on it.\n\nUnregistering multiple Realms\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    driver.unregister_realms([\"Google\", \"Github\", \"Twitter\"])\n\nThis would unregister all 3 realms in one operation, preventing further\nqueries from executing on them.\n\nRequesting\n~~~~~~~~~~\n\nUsing the *Selenium Webdriver* get method\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nTo pilot your web browser to a given URL, just use the *get* method as\nyou would normally do with your WebDriver instance. The only major\ndifference is that a *realms* kwarg is expected. A *wait* boolean kwargs\ncan also be provided (the behavior is explained later).\n\nExample of a valid call:\n\n.. code:: python\n\n    driver.get(\"http://github.com\", realms=[\"GitHub\"])\n\nIf not rate-limited, it would direct the browser to the provided URL.\n\nMultiple realms per request\n^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nYou can have a single request count against multiple realms if it makes\nsense in your use case.\n\n.. code:: python\n\n    driver.get(\"http://github.com\", realms=[\"GitHub\", \"GitHubUser123\", \"GitHubServer3\"])\n\nHandling exceptions\n^^^^^^^^^^^^^^^^^^^\n\nExecuting these *get* calls will either perform the action in the\nbrowser or raise a SeleniumRespectfulRateLimitedError exception. This\nmeans that you'll likely want to catch and handle that exception.\n\n.. code:: python\n\n    from selenium_respectful import SeleniumRespectfulRateLimitedError\n\n    try:\n        driver.get(\"http://github.com\", realm=\"GitHub\")\n    except SeleniumRespectfulRateLimitedError:\n        pass # Possibly requeue that call or wait.\n\nThe *wait* kwarg\n^^^^^^^^^^^^^^^^\n\nRequesting with a *get* call accepts a *wait* kwarg that defaults to\nFalse. If switched on and the realm is currently rate-limited, the\nprocess will block, wait until it is safe to send requests again and\nperform the requests then. Waiting is perfectly fine for scripts or\nsmaller operations but is discouraged for large, multi-realm, parallel\ntasks (i.e. Background Tasks like Celery workers).\n\nTests\n-----\n\n-  Exist? ``Yes``\n-  Exhaustive? ``Yes``\n-  Facepalm tactics?\n   ``Yes -  Redis calls aren't mocked and google.com gets a few friendly calls``\n\nRun them with ``python -m pytest tests --spec``\n\nFAQ\n---\n\nWhoa, whoa, whoa! Redis?!\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYes. The use of Redis allows for *selenium-respectful* to go\nmulti-thread, multi-process and even multi-machine while still\nrespecting the maximum requesting rates of registered realms. Operations\nlike Redis' SETEX are key in designing and working with rate-limiting\nsystems. If you are doing Python development, there is a decent chance\nyou already work with Redis as it is one of the two options to use as\nCelery's backend and one of the 2 major caching options in Web\ndevelopment. If not, you can always keep things clean and use a `Docker\nContainer <https://hub.docker.com/_/redis/>`__ or even `build it from\nsource <http://redis.io/download#installation>`__. Redis has kept a\nconsistent record over the years of being lightweight, solid software.\n",
        "description_content_type": null,
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/nbrochu/selenium-respectful",
        "keywords": "",
        "license": "Apache License v2",
        "maintainer": "",
        "maintainer_email": "",
        "name": "selenium-respectful",
        "package_url": "https://pypi.org/project/selenium-respectful/",
        "platform": "",
        "project_url": "https://pypi.org/project/selenium-respectful/",
        "project_urls": {
            "Homepage": "https://github.com/nbrochu/selenium-respectful"
        },
        "release_url": "https://pypi.org/project/selenium-respectful/0.1.0/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Minimalist Selenium webdriver wrapper to work within rate limits of any amount of services simultaneously. Parallel processing friendly.",
        "version": "0.1.0"
    },
    "last_serial": 2714847,
    "releases": {
        "0.1.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "e272bf5ebda23d07d501beff9f7b1456",
                    "sha256": "49bac0ca326f30810617e202c6a6f66e91de5370aca2a35729f432b4fd7a7533"
                },
                "downloads": -1,
                "filename": "selenium-respectful-0.1.0.tar.gz",
                "has_sig": false,
                "md5_digest": "e272bf5ebda23d07d501beff9f7b1456",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 6734,
                "upload_time": "2017-03-18T18:23:11",
                "url": "https://files.pythonhosted.org/packages/de/f4/b94e1ed12e4a242e4db1b189531e09adae1b431d6498b6cf785e3ab7265c/selenium-respectful-0.1.0.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "e272bf5ebda23d07d501beff9f7b1456",
                "sha256": "49bac0ca326f30810617e202c6a6f66e91de5370aca2a35729f432b4fd7a7533"
            },
            "downloads": -1,
            "filename": "selenium-respectful-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e272bf5ebda23d07d501beff9f7b1456",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 6734,
            "upload_time": "2017-03-18T18:23:11",
            "url": "https://files.pythonhosted.org/packages/de/f4/b94e1ed12e4a242e4db1b189531e09adae1b431d6498b6cf785e3ab7265c/selenium-respectful-0.1.0.tar.gz"
        }
    ]
}