{ "info": { "author": "Mikhail Korobov", "author_email": "kmike84@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "=======================\nscrapinghub-autoextract\n=======================\n\n.. image:: https://img.shields.io/pypi/v/scrapinghub-autoextract.svg\n :target: https://pypi.python.org/pypi/scrapinghub-autoextract\n :alt: PyPI Version\n\n.. image:: https://img.shields.io/pypi/pyversions/scrapinghub-autoextract.svg\n :target: https://pypi.python.org/pypi/scrapinghub-autoextract\n :alt: Supported Python Versions\n\n.. image:: https://travis-ci.org/scrapinghub/scrapinghub-autoextract.svg?branch=master\n :target: https://travis-ci.org/scrapinghub/scrapinghub-autoextract\n :alt: Build Status\n\n.. image:: https://codecov.io/github/scrapinghub/scrapinghub-autoextract/coverage.svg?branch=master\n :target: https://codecov.io/gh/scrapinghub/scrapinghub-autoextract\n :alt: Coverage report\n\n\nPython client libraries for the `Scrapinghub AutoExtract API`_.\nThey allow you to extract product and article information from any website.\n\nBoth synchronous and asyncio wrappers are provided by this package.\n\nThe license is BSD 3-clause.\n\n.. _Scrapinghub AutoExtract API: https://scrapinghub.com/autoextract\n\n\nInstallation\n============\n\n::\n\n pip install scrapinghub-autoextract\n\nscrapinghub-autoextract requires Python 3.6+ for the CLI tool and for\nthe asyncio API; the basic, synchronous API works with Python 3.5.\n\nUsage\n=====\n\nFirst, make sure you have an API key. 
To avoid passing it in the ``api_key``\nargument with every call, you can set the ``SCRAPINGHUB_AUTOEXTRACT_KEY``\nenvironment variable to the key.\n\nCommand-line interface\n----------------------\n\nThe most basic way to use the client is from the command line.\nFirst, create a file with URLs, one URL per line (e.g. ``urls.txt``).\nSecond, set the ``SCRAPINGHUB_AUTOEXTRACT_KEY`` environment variable to your\nAutoExtract API key (you can also pass the API key as the ``--api-key`` script\nargument).\n\nThen run the script to get the results::\n\n python -m autoextract urls.txt --page-type article > res.jl\n\nRun ``python -m autoextract --help`` to get a description of all supported\noptions.\n\nSynchronous API\n---------------\n\nThe synchronous API provides an easy way to try AutoExtract in a script.\nFor production usage, the asyncio API is strongly recommended.\n\nYou can send requests as described in the `API docs`_::\n\n from autoextract.sync import request_raw\n query = [{'url': 'http://example.com/foo', 'pageType': 'article'}]\n results = request_raw(query)\n\nNote that if there are several URLs in the query, results can be returned in\narbitrary order.\n\nThere is also an ``autoextract.sync.request_batch`` helper, which accepts URLs\nand a page type, and ensures results are in the same order as the requested URLs::\n\n from autoextract.sync import request_batch\n urls = ['http://example.com/foo', 'http://example.com/bar']\n results = request_batch(urls, page_type='article')\n\n.. note::\n Currently ``request_batch`` is limited to 100 URLs at a time.\n\n.. 
_API docs: https://doc.scrapinghub.com/autoextract.html\n\n\nasyncio API\n-----------\n\nBasic usage is similar to the sync API (``request_raw``),\nbut the asyncio event loop is used::\n\n    from autoextract.aio import request_raw\n\n    async def foo():\n        results1 = await request_raw(query)\n        # ...\n\nThere is also a ``request_parallel`` function, which allows processing\nmany URLs in parallel, using both batching and multiple connections::\n\n    import sys\n    from autoextract.aio import request_parallel, create_session\n\n    async def foo():\n        async with create_session() as session:\n            res_iter = request_parallel(urls, page_type='article',\n                                        n_conn=10, batch_size=3,\n                                        session=session)\n            for f in res_iter:\n                try:\n                    batch_result = await f\n                    for res in batch_result:\n                        # do something with a result\n                        ...\n                except ApiError as e:\n                    print(e, file=sys.stderr)\n                    raise\n\nThe ``request_parallel`` and ``request_raw`` functions handle throttling\n(HTTP 429 errors) and network errors, retrying a request in these cases.\n\nThe CLI implementation (``autoextract/__main__.py``) can serve\nas a usage example.\n\nContributing\n============\n\n* Source code: https://github.com/scrapinghub/scrapinghub-autoextract\n* Issue tracker: https://github.com/scrapinghub/scrapinghub-autoextract/issues\n\nUse tox_ to run tests with different Python versions::\n\n tox\n\nThe command above also runs type checks; we use mypy.\n\n.. 
_tox: https://tox.readthedocs.io\n\n\nChanges\n=======\n\nTBA\n---\n\nInitial release.\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/scrapinghub/scrapinghub-autoextract", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "scrapinghub-autoextract", "package_url": "https://pypi.org/project/scrapinghub-autoextract/", "platform": "", "project_url": "https://pypi.org/project/scrapinghub-autoextract/", "project_urls": { "Homepage": "https://github.com/scrapinghub/scrapinghub-autoextract" }, "release_url": "https://pypi.org/project/scrapinghub-autoextract/0.1/", "requires_dist": [ "requests", "tenacity ; python_version >= \"3.6\"", "aiohttp (>=3.6.0) ; python_version >= \"3.6\"", "tqdm ; python_version >= \"3.6\"" ], "requires_python": "", "summary": "Python interface to Scrapinghub Automatic Extraction API", "version": "0.1" }, "last_serial": 5951388, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "731b2061c99f5b9ef99cf61bed9d2b19", "sha256": "f0a9e69c49e5f1e3d1cdfa6069c322b1d9fa8d10c59a422295aa34cf74c14672" }, "downloads": -1, "filename": "scrapinghub_autoextract-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "731b2061c99f5b9ef99cf61bed9d2b19", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11964, "upload_time": "2019-10-09T18:16:23", "url": "https://files.pythonhosted.org/packages/30/ef/71ab8223947762163e062a0c79ce5019cce474831e77cf19d1fafd97e2d2/scrapinghub_autoextract-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "18ad64552554031e4bc6b67efc4d3677", "sha256": "672e67b9443aa5ab78345de212b273f92031c95688474b58b0b3fe46ba2d13fa" }, "downloads": -1, "filename": "scrapinghub-autoextract-0.1.tar.gz", "has_sig": false, "md5_digest": "18ad64552554031e4bc6b67efc4d3677", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 11042, "upload_time": "2019-10-09T18:16:27", "url": "https://files.pythonhosted.org/packages/81/1c/826a9aa957870fc84f1306ecc3b7d71a9eb4a57b254eb31b5e0813985d1c/scrapinghub-autoextract-0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "731b2061c99f5b9ef99cf61bed9d2b19", "sha256": "f0a9e69c49e5f1e3d1cdfa6069c322b1d9fa8d10c59a422295aa34cf74c14672" }, "downloads": -1, "filename": "scrapinghub_autoextract-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "731b2061c99f5b9ef99cf61bed9d2b19", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11964, "upload_time": "2019-10-09T18:16:23", "url": "https://files.pythonhosted.org/packages/30/ef/71ab8223947762163e062a0c79ce5019cce474831e77cf19d1fafd97e2d2/scrapinghub_autoextract-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "18ad64552554031e4bc6b67efc4d3677", "sha256": "672e67b9443aa5ab78345de212b273f92031c95688474b58b0b3fe46ba2d13fa" }, "downloads": -1, "filename": "scrapinghub-autoextract-0.1.tar.gz", "has_sig": false, "md5_digest": "18ad64552554031e4bc6b67efc4d3677", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11042, "upload_time": "2019-10-09T18:16:27", "url": "https://files.pythonhosted.org/packages/81/1c/826a9aa957870fc84f1306ecc3b7d71a9eb4a57b254eb31b5e0813985d1c/scrapinghub-autoextract-0.1.tar.gz" } ] }