{ "info": { "author": "qin", "author_email": "qin@qinxuye.me", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Topic :: Internet :: WWW/HTTP", "Topic :: Software Development :: Libraries :: Application Frameworks", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: System :: Clustering" ], "description": "===============================================\nCola: high-level distributed crawling framework\n===============================================\n\n.. image:: https://badge.fury.io/py/cola.svg\n :target: http://badge.fury.io/py/cola\n\nOverview\n========\n\n**Cola** is a high-level distributed crawling framework, \nused to crawl pages and extract structured data from websites.\nIt provides simple and fast yet flexible way to achieve your data acquisition objective.\nUsers only need to write one piece of code which can run under both local and distributed mode.\n\nRequirements\n------------\n\n* Python2.7 (Python3+ will be supported later)\n* Work on Linux, Windows and Mac OSX\n\nInstall\n=======\n\nThe quick way:\n\n::\n \n pip install cola\n\nOr, download source code, then run:\n\n::\n \n python setup.py install\n\nWrite applications\n==================\n\nDocuments will update soon, now just refer to the \n`wiki `_ or\n`weibo `_ application.\n\nRun applications\n================\n\nFor the wiki or weibo app, please ensure the installation of dependencies, weibo as an example:\n\n::\n\n pip install -r /path/to/cola/app/weibo/requirements.txt\n\nLocal mode\n----------\n\nIn order to let your application support local mode, just add code to the entrance as below.\n\n.. code-block:: python\n\n from cola.context import Context\n ctx = Context(local_mode=True)\n ctx.run_job(os.path.dirname(os.path.abspath(__file__)))\n\nThen run the application:\n\n::\n\n python __init__.py\n \nStop the local job by ``CTRL+C``.\n\nDistributed mode\n----------------\n\nStart master:\n\n::\n\n coca master -s [ip:port]\n\nStart one or more workers:\n\n::\n\n coca worker -s -m [ip:port]\n\nThen run the application(weibo as an example):\n\n::\n\n coca job -u /path/to/cola/app/weibo -r\n\nCoca command\n============\n\nCoca is a convenient command-line tool for the whole cola environment.\n\nmaster\n------\n\nKill master to stop the whole cluster:\n\n::\n\n coca master -k\n\njob\n---\n\nList all jobs:\n\n::\n\n coca job -m [ip:port] -l\n\nExample as:\n\n::\n\n list jobs at master: 10.211.55.2:11103\n ====> job id: 8ZcGfAqHmzc, job description: sina weibo crawler, status: stopped\n\nYou can run a job which shown in the list above:\n\n::\n\n coca job -r 8ZcGfAqHmzc\n\nActually, you don't have to input the complete job name:\n\n::\n\n coca job -r 8Z\n\nPart of the job name is fine if there's no conflict.\n\nYou can know the status of a running job by:\n\n::\n\n coca job -t 8Z\n\nThe status like counters during running and so on will be output \nto the terminal.\n\nYou can kill a job by the kill command:\n\n::\n\n coca job -k 8Z\n\nstartproject\n------------\n\nYou can create an application by this command:\n\n::\n\n coca startproject colatest\n\nRemember, help command will always be helpful:\n\n::\n\n coca -h\n\nor\n\n::\n\n coca master -h\n\n\nNotes\n=====\n\n`Chinese docs(wiki) `_.\n\nDonation\n========\n\nCola is a non-profit project and by now maintained by myself, \nthus any donation will be encouragement for the further improvements of cola project.\n\n**Alipay & Paypal: qinxuye@gmail.com**", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://colaframework.org", "keywords": null, "license": "Apache 2", "maintainer": null, "maintainer_email": null, "name": "Cola", "package_url": "https://pypi.org/project/Cola/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/Cola/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://colaframework.org" }, "release_url": "https://pypi.org/project/Cola/0.1.0/", "requires_dist": null, "requires_python": null, "summary": "A high-level distributed crawling framework", "version": "0.1.0" }, "last_serial": 2037298, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "d3070dd6317abb62daacbfece908ed34", "sha256": "0950e45ac291b832bd43eb2d669c3b4d07428e3be16f59d9962aba9a67419f26" }, "downloads": -1, "filename": "Cola-0.1.0.tar.gz", "has_sig": false, "md5_digest": "d3070dd6317abb62daacbfece908ed34", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64261, "upload_time": "2016-03-31T04:19:26", "url": "https://files.pythonhosted.org/packages/2d/11/4456bfdf9e2000d19d903739028ad5c668827d22f91193838a7c0e856c9f/Cola-0.1.0.tar.gz" } ], "0.1.0a0": [], "0.1.0b0": [ { "comment_text": "", "digests": { "md5": "c8c50c9676248519f429de17dd8fe347", "sha256": "f1761373da32a471a403f8da5b72467eb34f6e7331771c5dc792b6b87963bf4c" }, "downloads": -1, "filename": "Cola-0.1.0b0.tar.gz", "has_sig": false, "md5_digest": "c8c50c9676248519f429de17dd8fe347", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 59047, "upload_time": "2015-07-05T17:06:27", "url": "https://files.pythonhosted.org/packages/82/cf/8ec37a55267ba72e9b7af4134aedaf76b4e459b0b366d2ad8f6c409bf7a5/Cola-0.1.0b0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d3070dd6317abb62daacbfece908ed34", "sha256": "0950e45ac291b832bd43eb2d669c3b4d07428e3be16f59d9962aba9a67419f26" }, "downloads": -1, "filename": "Cola-0.1.0.tar.gz", "has_sig": false, "md5_digest": "d3070dd6317abb62daacbfece908ed34", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64261, "upload_time": "2016-03-31T04:19:26", "url": "https://files.pythonhosted.org/packages/2d/11/4456bfdf9e2000d19d903739028ad5c668827d22f91193838a7c0e856c9f/Cola-0.1.0.tar.gz" } ] }