{ "info": { "author": "Rub\u00e9n H. Garc\u00eda", "author_email": "raiben@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "[![Project status](https://travis-ci.com/raiben/tropescraper.svg?branch=master)](https://travis-ci.com/raiben/tropescraper)\n[![made-with-python](https://img.shields.io/badge/Made%20with-Python-1f425f.svg)](https://www.python.org/)\n[![GitHub license](https://img.shields.io/github/license/raiben/tropescraper.svg)](https://github.com/raiben/tropescraper/blob/master/LICENSE)\n[![Github all releases](https://img.shields.io/github/downloads/raiben/tropescraper/total.svg)](https://GitHub.com/raiben/tropescraper/releases/)\n[![GitHub issues](https://img.shields.io/github/issues/raiben/tropescraper.svg)](https://GitHub.com/Naereen/raiben/tropescraper/)\n[![PyPI version](https://badge.fury.io/py/tropescraper.svg)](https://badge.fury.io/py/tropescraper)\n\n# tropescraper\n\nA tool to scrape all the films and their tropes\nfrom the website TV Tropes.\n\n## Requirements\n\nThis tool requires Python >= 3.6.\n\n## Install as library\n\nIf you want to use this tool, you don't need to download the sources\nor clone the repository. Just install the latest version through PyPI:\n```\npip install tropescraper\n```\nThis will create an executable called `scrape-tvtropes`.\n\n## Running the executable\n\nTry to execute:\n```\nscrape-tvtropes\n```\n\nThe script can take several hours to finish, but don't worry:\nit can be stopped at any moment because it relies on a file cache, so\nit will not re-download the same page twice, even across\ndifferent executions.\n\nThe script will create a folder called `scraper_cache` in the current working directory,\ncontaining thousands of small compressed files. 
\n\nWhen the script finishes, you will also see a file called `tvtropes.json`\nwith JSON content in the following format:\n\n```json\n{\n \"film_identifier\":[\n \"trope_identifier\",\n \"trope_identifier\"\n ],\n ...\n}\n```\n\n### The log\n\nThe output should look like this:\n```log\nINFO:tropescraper.tvtropes_scraper:Process started\n* Remember that you can stop and restart at any time.\n** Please, remove manually the cache folder when you are done\n\nINFO:tropescraper.tvtropes_scraper:Scraping film ids...\nINFO:tropescraper.adaptors.file_cache:Building cache directory: ./scraper_cache\nINFO:tropescraper.adaptors.file_cache:Cache miss for https://tvtropes.org/pmwiki/pmwiki.php/Main/Film\nINFO:tropescraper.adaptors.file_cache:Cache set for https://tvtropes.org/pmwiki/pmwiki.php/Main/Film\nINFO:tropescraper.adaptors.file_cache:Cache miss for https://tvtropes.org/pmwiki/pmwiki.php/Main/Tropes\nINFO:tropescraper.adaptors.file_cache:Cache set for https://tvtropes.org/pmwiki/pmwiki.php/Main/Tropes\n...\n...\nINFO:tropescraper.adaptors.file_cache:Cache set for https://tvtropes.org/pmwiki/pmwiki.php/Film/ZyzzyxRoad\nINFO:tropescraper.tvtropes_scraper:Saved dictionary -> [] as JSON file tvtropes.json\nINFO:tropescraper.tvtropes_scraper:Summary:\n- Films: 12147\n- Tropes: 26479\n- Cache: CacheInformation(size=179513467, files_count=12290, \n creation_date=datetime.datetime(2019, 9, 15, 14, 59, 2))\n```\n\nThe tool is verbose, so you can follow the progress and estimate the\nduration of the process. 
You will be able to see a message\nin the form `Status: {film_index}/{total} films scraped`.\n\nApart from this, the output is not really interesting unless\nsomething goes wrong :-) and it can be ignored.\n\n\n## Work with the original code\n\nIf you are a contributor and want to work on and improve the code,\nyou only need to clone the project and install all the dependencies with:\n\n```bash\ngit clone https://github.com/raiben/tropescraper.git\ncd tropescraper\npip install -r requirements.txt\n```\n\nTo build the module:\n\n1. Don't forget to update the version in `setup.py`\naccording to the [Semantic Versioning](https://semver.org/) standard.\n\n2. Then, just run in the project folder:\n```bash\npython3 setup.py sdist bdist_wheel\npip install dist/tropescraper-.tar.gz\n```\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/raiben/tropescraper", "keywords": "", "license": "LGPL", "maintainer": "", "maintainer_email": "", "name": "tropescraper", "package_url": "https://pypi.org/project/tropescraper/", "platform": "", "project_url": "https://pypi.org/project/tropescraper/", "project_urls": { "Homepage": "https://github.com/raiben/tropescraper" }, "release_url": "https://pypi.org/project/tropescraper/1.0.2/", "requires_dist": [ "requests", "lxml", "cssselect" ], "requires_python": "", "summary": "A TvTropes scrapper", "version": "1.0.2" }, "last_serial": 5832465, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "1c56422792588807a6b5c9e0264c4b28", "sha256": "9e9c8e22155cef75c4533cd13837e2f7f399d3531ff31a1e98ebabfe2522ec00" }, "downloads": -1, "filename": "tropescraper-0.1.tar.gz", "has_sig": false, "md5_digest": "1c56422792588807a6b5c9e0264c4b28", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3457, "upload_time": "2019-06-01T19:49:29", "url": 
"https://files.pythonhosted.org/packages/16/56/03cd46cf98ed88c08a8eef7abd6c5fab1661c73295575bfc9c9709c5d217/tropescraper-0.1.tar.gz" } ], "1.0": [ { "comment_text": "", "digests": { "md5": "25d2a0de408b0460b9201819bc1708a5", "sha256": "627464c9cad85cc6a0cacef2ea789c2611a33aaac089d7df7688473b26dafd6c" }, "downloads": -1, "filename": "tropescraper-1.0.tar.gz", "has_sig": false, "md5_digest": "25d2a0de408b0460b9201819bc1708a5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2701, "upload_time": "2019-09-15T11:44:19", "url": "https://files.pythonhosted.org/packages/22/e3/27c37036a3ec36ac5506db5097eb1b03dbd22e77e35f41451b18d42b3bd4/tropescraper-1.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "7c14908799f31e58318ace7414b99773", "sha256": "90c7e329bc4eed074b5c3394c2dbd39a1c142cc5aa5cefe2beb9b9d65f1e3cea" }, "downloads": -1, "filename": "tropescraper-1.0.1.tar.gz", "has_sig": false, "md5_digest": "7c14908799f31e58318ace7414b99773", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5919, "upload_time": "2019-09-15T13:19:37", "url": "https://files.pythonhosted.org/packages/00/eb/5b85c549575e9ec262656646da39ca0235a1a43c0db42582dd18658a2a64/tropescraper-1.0.1.tar.gz" } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "d72b344cfacd621d135cd02a61919d65", "sha256": "b3df9906c6886c89bb40800ae80710e6fe47fcf1ac29c2156853f53a87025745" }, "downloads": -1, "filename": "tropescraper-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "d72b344cfacd621d135cd02a61919d65", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12462, "upload_time": "2019-09-15T16:26:36", "url": "https://files.pythonhosted.org/packages/7c/6a/3839d1467210176712ed19c890cf148c6b640772df75aa52ba8058058fa1/tropescraper-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "320a45a831f438cf1b020bde2cd75a18", "sha256": 
"97089e93792941873335d5c0315c6e2a32e2759172784da3c27ed0f99b161def" }, "downloads": -1, "filename": "tropescraper-1.0.2.tar.gz", "has_sig": false, "md5_digest": "320a45a831f438cf1b020bde2cd75a18", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7723, "upload_time": "2019-09-15T16:26:37", "url": "https://files.pythonhosted.org/packages/9c/e4/0fed600c05a6ed86325848156049d9e0522c5b995c7d7128d98666eb2650/tropescraper-1.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d72b344cfacd621d135cd02a61919d65", "sha256": "b3df9906c6886c89bb40800ae80710e6fe47fcf1ac29c2156853f53a87025745" }, "downloads": -1, "filename": "tropescraper-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "d72b344cfacd621d135cd02a61919d65", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12462, "upload_time": "2019-09-15T16:26:36", "url": "https://files.pythonhosted.org/packages/7c/6a/3839d1467210176712ed19c890cf148c6b640772df75aa52ba8058058fa1/tropescraper-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "320a45a831f438cf1b020bde2cd75a18", "sha256": "97089e93792941873335d5c0315c6e2a32e2759172784da3c27ed0f99b161def" }, "downloads": -1, "filename": "tropescraper-1.0.2.tar.gz", "has_sig": false, "md5_digest": "320a45a831f438cf1b020bde2cd75a18", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7723, "upload_time": "2019-09-15T16:26:37", "url": "https://files.pythonhosted.org/packages/9c/e4/0fed600c05a6ed86325848156049d9e0522c5b995c7d7128d98666eb2650/tropescraper-1.0.2.tar.gz" } ] }