{ "info": { "author": "Theotime Leveque", "author_email": "theotime.leveque@gmail.com", "bugtrack_url": null, "classifiers": [ "Environment :: Web Environment", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Topic :: Internet :: WWW/HTTP :: Dynamic Content" ], "description": "Cabu\n====\n\n.. image:: https://readthedocs.org/projects/cabu/badge/?version=latest\n :target: http://cabu.readthedocs.org/en/latest/?badge=latest\n :alt: Documentation Status\n\nCabu is a simple microservice framework to remotely crawl websites.\nIt's built on Flask and Selenium and includes a virtual display wrapper and a few methods.\n\n`Full documentation here`_\n\nUsage\n=====\n\n.. code-block:: python\n\n    @app.route('/gizmodo_last_articles_links')\n    def gizmodo_last_articles():\n        app.webdriver.get('http://www.gizmodo.com')\n        articles_links = [i.get_attribute('href') for i in app.webdriver.find_elements_by_css_selector('h1.headline>a')]\n\n        return jsonify({'articles': articles_links})\n\n\nInstalling\n==========\n\n\n.. code-block:: console\n\n $ pip install cabu\n\nFeatures\n========\n\n- Selenium configuration out of the box\n- Flask wrapping\n- Crawling methods included\n- AWS S3 Export\n- FTP / FTPS\n- Cookies persistence\n- Link extractor\n- Proxy configuration\n- Headless optional for local debug\n- Docker pre-configured distributed environment\n- Database handler\n- Compatible with most Flask extensions (Flask-Admin, Flask-Mail, Flask-OAuth, ...)\n- 12 Factors compliance\n\n(Likely to come soon)\n\n- CouchDB support\n- Couchbase support\n- Mobile drivers\n- SFTP\n- HtmlUnit web driver\n- Remote webdriver wrapper\n- Parallelization\n- Neural Network plugins\n\n\nTesting\n=======\n\nAll tests were written using Docker services instead of Mocks.\nAlternative mocks will be added soon ;)\n\n.. 
code-block:: console\n\n $ pip install -r requirements-dev.txt\n $ py.test cabu/tests\n\nContributing\n============\n\nPlease see the `Contribute page`_.\n\nCopyright\n=========\n\nCabu is an open source project by `Th\u00e9otime L\u00e9v\u00e8que`_.\n\n\n.. _`Full documentation here`: https://cabu.readthedocs.org/\n.. _`Contribute page`: https://cabu.readthedocs.org/contribute\n.. _`Th\u00e9otime L\u00e9v\u00e8que`: https://github.com/thylong\n", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/thylong/cabu", "keywords": null, "license": "BSD", "maintainer": null, "maintainer_email": null, "name": "cabu", "package_url": "https://pypi.org/project/cabu/", "platform": "any", "project_url": "https://pypi.org/project/cabu/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://github.com/thylong/cabu" }, "release_url": "https://pypi.org/project/cabu/0.0.2/", "requires_dist": null, "requires_python": null, "summary": "cabu is a simple REST microservice to scrap content from anywhere.", "version": "0.0.2" }, "last_serial": 1959758, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "264de093629c61536046ceaf52c89b2b", "sha256": "545fee57cd04fd2afc2b8e0dc60d00f4e08ca226ce54eb624fa23d0ab3303874" }, "downloads": -1, "filename": "cabu-0.0.1.tar.gz", "has_sig": false, "md5_digest": "264de093629c61536046ceaf52c89b2b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5845, "upload_time": "2016-02-05T22:41:50", "url": "https://files.pythonhosted.org/packages/b3/7e/1e762ea92d98dffbe301472066cbd372efb40fd4c2b85a3427d23270d59b/cabu-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "8bb0d61deef3948e5c688afbc35c2b81", "sha256": "ffbcaa80afcf7f4eb7c930e7682889853668dd2b34fd89b30d9cfe15bed6f41a" }, "downloads": -1, "filename": "cabu-0.0.2-py2.py3-none-any.whl", "has_sig": false, 
"md5_digest": "8bb0d61deef3948e5c688afbc35c2b81", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 8133, "upload_time": "2016-02-16T16:50:30", "url": "https://files.pythonhosted.org/packages/a2/60/1b6942169220ee8cac4983870e4ab550d31a50732137e0edf045446bd5ee/cabu-0.0.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f524930414d53f1902fba04f963b9bf8", "sha256": "56cfb267fa81fe8abb0be1c21f64f839b6ae2c85256943458e722316164db519" }, "downloads": -1, "filename": "cabu-0.0.2.tar.gz", "has_sig": false, "md5_digest": "f524930414d53f1902fba04f963b9bf8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6072, "upload_time": "2016-02-16T16:50:13", "url": "https://files.pythonhosted.org/packages/53/11/1a7f1fadf48c3713badee34d2a3a646d7436da37a109669e2a0ff4d15daf/cabu-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8bb0d61deef3948e5c688afbc35c2b81", "sha256": "ffbcaa80afcf7f4eb7c930e7682889853668dd2b34fd89b30d9cfe15bed6f41a" }, "downloads": -1, "filename": "cabu-0.0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "8bb0d61deef3948e5c688afbc35c2b81", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 8133, "upload_time": "2016-02-16T16:50:30", "url": "https://files.pythonhosted.org/packages/a2/60/1b6942169220ee8cac4983870e4ab550d31a50732137e0edf045446bd5ee/cabu-0.0.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f524930414d53f1902fba04f963b9bf8", "sha256": "56cfb267fa81fe8abb0be1c21f64f839b6ae2c85256943458e722316164db519" }, "downloads": -1, "filename": "cabu-0.0.2.tar.gz", "has_sig": false, "md5_digest": "f524930414d53f1902fba04f963b9bf8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6072, "upload_time": "2016-02-16T16:50:13", "url": 
"https://files.pythonhosted.org/packages/53/11/1a7f1fadf48c3713badee34d2a3a646d7436da37a109669e2a0ff4d15daf/cabu-0.0.2.tar.gz" } ] }