{ "info": { "author": "Andy Roche", "author_email": "andy@roche.io", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# wiki-table-scrape\n\nScrape HTML tables from a Wikipedia page into CSV format.\n\n## Why?\n\n... TODO fill this out ...\n\nRead more about the initial project in [the blog post][blog-post].\n\n## Installation\n\nFrom [PyPI](https://pypi.org/project/wikitablescrape/) using [Python 3](https://www.python.org/downloads/)\n\n### As a user-level package\n\n```sh\npython3 -m pip install --user wikitablescrape\nwikitablescrape --help\n```\n\n### In a virtual environment\n\n```sh\npython3 -m venv venv\n. venv/bin/activate\n\n# From pip\npip install wikitablescrape\nwikitablescrape --help\n\n# From source\npython setup.py install\nwikitablescrape --help\n```\n\n## Usage\n\n```sh\n# Find a single HTML table and write as CSV to stdout\npython -m wikitablescrape --url=\"https://en.wikipedia.org/wiki/List_of_mountains_by_elevation\" --header=\"8000 metres\" | head -5\n# \"Mountain\",\"Metres\",\"Feet\",\"Range\",\"Location and Notes\"\n# \"Mount Everest\",\"8,848\",\"29,029\",\"Himalayas\",\"Nepal/China\"\n# \"K2\",\"8,611\",\"28,251\",\"Karakoram\",\"Pakistan/China\"\n# \"Kangchenjunga\",\"8,586\",\"28,169\",\"Himalayas\",\"Nepal/India \u2013 Highest in India\"\n# \"Lhotse\",\"8,516\",\"27,940\",\"Himalayas\",\"Nepal/China \u2013 Climbers ascend Lhotse Face in climbing Everest\"\n\n# Download an entire page of CSV files into a folder\npython -m wikitablescrape --url=\"https://en.wikipedia.org/wiki/List_of_mountains_by_elevation\" --output-folder=\"/tmp/scrape\"\n```\n\n## Testing\n\n```sh\n# Run unit tests and code coverage checks\ncoverage run --source wikitablescrape -m unittest discover && coverage report --fail-under=80\n\n# (Optionally) See coverage data\ncoverage html && open htmlcov/index.html\n```\n\n## Sample Articles for 
Scraping\n\n- [Top 25 Articles this Month](https://en.wikipedia.org/wiki/Wikipedia:Top_25_Report)\n- [Top 100 Articles of All Time](https://en.wikipedia.org/wiki/Wikipedia:Multiyear_ranking_of_most_viewed_pages#Top-100_list)\n\n## Contributing\n\nIf you would like to contribute to this module, please open an issue or pull request.\n\n## More Information\n\nIf you'd like to read more about this module, please check out [my blog post][blog-post].\n\n[blog-post]: https://roche.io/2016/05/scrape-wikipedia-with-python\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/rocheio/wiki-table-scrape", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "wikitablescrape", "package_url": "https://pypi.org/project/wikitablescrape/", "platform": "", "project_url": "https://pypi.org/project/wikitablescrape/", "project_urls": { "Homepage": "https://github.com/rocheio/wiki-table-scrape" }, "release_url": "https://pypi.org/project/wikitablescrape/1.0.2/", "requires_dist": [ "beautifulsoup4 (==4.*)", "lxml (==4.*)", "requests (==2.*)" ], "requires_python": "", "summary": "Scrape HTML tables from a Wikipedia page into CSV format.", "version": "1.0.2" }, "last_serial": 5796509, "releases": { "1.0.2": [ { "comment_text": "", "digests": { "md5": "53f131428088857a3fdef0ce7fdf581d", "sha256": "4b158d21d6c640c7da247c604d09c88b54820a3424f969583e69abf43880955e" }, "downloads": -1, "filename": "wikitablescrape-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "53f131428088857a3fdef0ce7fdf581d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9078, "upload_time": "2019-09-07T15:40:52", "url": "https://files.pythonhosted.org/packages/d0/22/e2330078775be99eb7d42e84d96e788b9c51af45f82f3761cc6b998733b7/wikitablescrape-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"fdda0e8fea660928d6658f7c24ca48ea", "sha256": "5379beb358fe04702d38d4d245bebebd2b504ab9c358ee597a911c060ac0834e" }, "downloads": -1, "filename": "wikitablescrape-1.0.2.tar.gz", "has_sig": false, "md5_digest": "fdda0e8fea660928d6658f7c24ca48ea", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6884, "upload_time": "2019-09-07T15:40:54", "url": "https://files.pythonhosted.org/packages/64/04/b6fb81a8d28d10a47dad9b118a909884406b0bcef6bc9fb216212e5cff64/wikitablescrape-1.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "53f131428088857a3fdef0ce7fdf581d", "sha256": "4b158d21d6c640c7da247c604d09c88b54820a3424f969583e69abf43880955e" }, "downloads": -1, "filename": "wikitablescrape-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "53f131428088857a3fdef0ce7fdf581d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9078, "upload_time": "2019-09-07T15:40:52", "url": "https://files.pythonhosted.org/packages/d0/22/e2330078775be99eb7d42e84d96e788b9c51af45f82f3761cc6b998733b7/wikitablescrape-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "fdda0e8fea660928d6658f7c24ca48ea", "sha256": "5379beb358fe04702d38d4d245bebebd2b504ab9c358ee597a911c060ac0834e" }, "downloads": -1, "filename": "wikitablescrape-1.0.2.tar.gz", "has_sig": false, "md5_digest": "fdda0e8fea660928d6658f7c24ca48ea", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6884, "upload_time": "2019-09-07T15:40:54", "url": "https://files.pythonhosted.org/packages/64/04/b6fb81a8d28d10a47dad9b118a909884406b0bcef6bc9fb216212e5cff64/wikitablescrape-1.0.2.tar.gz" } ] }