{ "info": { "author": "Joshua Hruzik", "author_email": "joshua.hruzik@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Topic :: Internet :: WWW/HTTP" ], "description": "# pyreiseamt\nThe German Foreign Office publishes travel information for nearly all countries in the world. This includes information on security and medical matters. The information is kept up to date and allows you to assess whether a certain country is safe to visit.\n\npyreiseamt is designed to crawl the information on https://www.auswaertiges-amt.de/de/ReiseUndSicherheit/reise-und-sicherheitshinweise and present the data in structured JSON format. pyreiseamt can also deliver sentiment analysis for every country and every category. This allows you to mine the available data more efficiently.\n\n### Installation\nYou can install pyreiseamt via pip.\n\n```bash\npip install pyreiseamt\n```\n\nAfter installation has finished, you can either use pyreiseamt as a CLI tool or import its scraper into one of your scripts.\n\n### Usage\n\npyreiseamt offers a unified point of entry for two basic tasks: *list* available countries and *extract* information on one or more countries.\n\n#### List Available Countries\nIf you want to get a list of available country names (or want to make sure that you use the correct one), you can use the *list* command. No additional arguments are needed:\n\n```bash\npyreiseamt list\n```\n\nAfter fetching the newest data from the Foreign Office's website, pyreiseamt prints a list of all available countries to your screen. Each row contains four countries, separated by ' | '.\n\n\n#### Extract Information\nAssume that you want to crawl information on all available countries. 
You can use the *extract* command with the *-o* (output path) argument. *-o* should point to a JSON file to which the results will be written. Note that your output file name should always end with '.json'.\n\n```bash\npyreiseamt extract -o ~/all_countries.json\n```\n\nIf you want to limit the crawl job to certain countries, you can use the *-c* argument. A single string should list all countries you want to extract, separated by semicolons.\n\n```bash\npyreiseamt extract -o ~/select_countries.json -c \"Frankreich;Georgien;Griechenland\"\n```\n\nThis will limit the extraction to France, Georgia, and Greece.\n\nThere are two remaining options to the extract command. Passing *-s* calculates the sentiment for every top category of every country. Passing *-n* makes sure that the top category names are consistent across all countries. This option is necessary because a category might have a different name for one country than for another, despite having the same content. If you want to extract information on all countries and also include sentiment and consistent category names, you could use pyreiseamt like so:\n\n```bash\npyreiseamt extract -o ~/all_countries.json -s -n\n```\n\n#### Use the Scraper in Your Own Scripts\nIf you prefer to use the built-in crawler in your scripts, you can do so by importing the scraper from the package (assuming that you've installed the package).\n\n```python\nfrom pyreiseamt.scraper import extract_country\n\nurl = \"https://www.auswaertiges-amt.de/de/ReiseUndSicherheit/australiensicherheit/213920\"\n\naustralia = extract_country(url)\n```\n\naustralia will be a dictionary holding the relevant texts for every top and sub category.\n\n### Links\npyreiseamt is a crawler with a CLI for https://www.auswaertiges-amt.de/de/ReiseUndSicherheit/reise-und-sicherheitshinweise. However, right now it only covers information on country-specific security issues, general travel guidance, and medical conditions. 
You can use the website to extract more information if needed.\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Jhruzik/pyreiseamt", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "pyreiseamt", "package_url": "https://pypi.org/project/pyreiseamt/", "platform": "", "project_url": "https://pypi.org/project/pyreiseamt/", "project_urls": { "Homepage": "https://github.com/Jhruzik/pyreiseamt" }, "release_url": "https://pypi.org/project/pyreiseamt/0.0.1/", "requires_dist": [ "requests", "bs4", "lxml" ], "requires_python": "", "summary": "Package to crawl country information of German Foreign Office", "version": "0.0.1" }, "last_serial": 5437564, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "12906d9e2fe2cbc7ce308c23cab5aef6", "sha256": "a00088473e3ad3e0641346bb2a04220cfe9148c709cc44d72d1b5bf0cf4f1271" }, "downloads": -1, "filename": "pyreiseamt-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "12906d9e2fe2cbc7ce308c23cab5aef6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 212495, "upload_time": "2019-06-23T15:23:53", "url": "https://files.pythonhosted.org/packages/b4/1a/35d15dcefbf892e169c4f2a42145750dbe56055565dff45db65ab2e330ec/pyreiseamt-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "719ae600b5f8e715f58bde4ec1168063", "sha256": "b3e23f7f02ffcc9804c270e0d61e978e488cf5f5a26f97434d69655b690589be" }, "downloads": -1, "filename": "pyreiseamt-0.0.1.tar.gz", "has_sig": false, "md5_digest": "719ae600b5f8e715f58bde4ec1168063", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6724, "upload_time": "2019-06-23T15:23:56", "url": "https://files.pythonhosted.org/packages/48/10/c0838157b570ea4695c04a1403f246de1ba364e3e79d43f34814b7c12af9/pyreiseamt-0.0.1.tar.gz" } ] }, "urls": [ { 
"comment_text": "", "digests": { "md5": "12906d9e2fe2cbc7ce308c23cab5aef6", "sha256": "a00088473e3ad3e0641346bb2a04220cfe9148c709cc44d72d1b5bf0cf4f1271" }, "downloads": -1, "filename": "pyreiseamt-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "12906d9e2fe2cbc7ce308c23cab5aef6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 212495, "upload_time": "2019-06-23T15:23:53", "url": "https://files.pythonhosted.org/packages/b4/1a/35d15dcefbf892e169c4f2a42145750dbe56055565dff45db65ab2e330ec/pyreiseamt-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "719ae600b5f8e715f58bde4ec1168063", "sha256": "b3e23f7f02ffcc9804c270e0d61e978e488cf5f5a26f97434d69655b690589be" }, "downloads": -1, "filename": "pyreiseamt-0.0.1.tar.gz", "has_sig": false, "md5_digest": "719ae600b5f8e715f58bde4ec1168063", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6724, "upload_time": "2019-06-23T15:23:56", "url": "https://files.pythonhosted.org/packages/48/10/c0838157b570ea4695c04a1403f246de1ba364e3e79d43f34814b7c12af9/pyreiseamt-0.0.1.tar.gz" } ] }