{ "info": { "author": "Cyclododecene", "author_email": "terenceliu1012@outlook.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9" ], "description": "# NewsFeed\n[![py_version](https://img.shields.io/badge/python-3.7+-brightgreen)](https://www.python.org/)\n[![PyPI Version](https://img.shields.io/pypi/v/newsfeed.svg)](https://pypi.org/project/newsfeed)\n[![GDLET_version](https://img.shields.io/badge/GDELT-V1&V2-red)](https://gdeltproject.org)\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n\n## Introduction\n\nNewsfeed based on GDELT Project\n\n### Installation\n\n```shell\nconda create -n newsfeed python=3.7\npython setup install\n```\n\n## GDELT API\n\nBased on the [gdelt-doc-api](https://github.com/alex9smith/gdelt-doc-api/), we consider a continuous querying mechanism by spliting the time range into multiple sub range (default setting is every 60 minutes).\n\n* FIPS 2 letter Contries list: please check: [LOOK-UP COUNTRIES](http://data.gdeltproject.org/api/v2/guides/LOOKUP-COUNTRIES.TXT)\n* GKG Themes list: please check: [LOOK-UP THEMES](http://data.gdeltproject.org/documentation/GKG-MASTER-THEMELIST.TXT)\n\nThe URL encoding reference: [url encode](https://www.eso.org/~ndelmott/url_encode.html)\n\n\n - [x] [GDELT DOC 2.0 API](https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/)\n - [x] [GDELT GEO 2.0 API](https://blog.gdeltproject.org/gdelt-geo-2-0-api-debuts) # BETA VERSION\n - [ ] [GDELT TV 2.0 API](https://blog.gdeltproject.org/gdelt-2-0-television-api-debuts/) # NOT YET\n\n\n## GDELT Database Query\n\n### GDELT 1.0\n\n - [x] [GDELT Events Database 1.0](http://data.gdeltproject.org/events/index.html)\n - [x] [GDELT Global Knowledge Graph 1.0](http://data.gdeltproject.org/gkg/index.html)\n\n### GDELT 2.0\n\n - [x] [GDELT Events Database 2.0](https://blog.gdeltproject.org/gdelt-2-0-our-global-world-in-realtime/)\n - [x] [GDELT Mentions Database (new)](https://blog.gdeltproject.org/gdelt-2-0-our-global-world-in-realtime/)\n - [x] [GDELT Global Knowledge Graph 2.0](https://blog.gdeltproject.org/gdelt-2-0-our-global-world-in-realtime/)\n\n### GDELT Others\n- [x] [GDELT Global Entity Graph](https://blog.gdeltproject.org/announcing-the-global-entity-graph-geg-and-a-new-11-billion-entity-dataset/)\n- [x] [GDELT Visual Global Entity Graph](https://blog.gdeltproject.org/what-googles-cloud-video-ai-sees-watching-decade-of-television-news-the-visual-global-entity-graph-2-0/)\n- [x] [GDELT Different Graph](https://blog.gdeltproject.org/announcing-the-gdelt-global-difference-graph-gdg-planetary-scale-change-detection-for-the-global-news-media/) \n- [x] [GDELT Global Frontpage Graph](https://blog.gdeltproject.org/announcing-gdelt-global-frontpage-graph-gfg/)\n\n## HOWTO\n\n### APIs\n\n#### For Article query:\n\n```python\nfrom newsfeed.news.apis.filters import * \nfrom newsfeed.news.apis.query import * \n\nf = Art_Filter(\n keyword = [\"Exchange Rate\", \"World\"],\n start_date = \"20211231000000\",\n end_date = \"20211231010000\",\n country = [\"China\", \"US\"]\n)\n\narticles_30 = article_search(query_filter = f, max_recursion_depth = 100, time_range = 30)\narticles_60 = article_search(query_filter = f, max_recursion_depth = 100, time_range = 60)\n```\n\n#### For Timeline query:\n\n```python\nfrom newsfeed.news.apis.filters import * \nfrom newsfeed.news.apis.query import * \n\nf = Art_Filter(\n keyword = [\"Exchange Rate\", \"World\"],\n start_date = \"2021-12-31-00-00-00\",\n end_date = \"2021-12-31-01-00-00\",\n country = [\"China\", \"US\"]\n)\ntimelineraw = timeline_search(query_filter = f, max_recursion_depth = 100, query_mode = \"timelinevolraw\")\n```\n\n#### For GEO query:\n\n```python\nfrom newsfeed.news.apis.filters import * \nfrom newsfeed.news.apis.query import * \n\nf = Art_Filter(\n keyword = [\"Exchange Rate\", \"World\"],\n country = [\"China\", \"US\"]\n)\ngeo_7d = geo_search(query_filter = f, sourcelang=\"english\", timespan=7)\n```\n\nquery_mode:\n* artlist: `article_search`\n* timeline: `timelinevol`, `timelinevolraw`, `timelinetone`, `timelinelang`, `timelinesourcecountry`\n\nmost of the parameters are the same with [gdelt-doc-api](https://github.com/alex9smith/gdelt-doc-api/), however, to specify the precise date range, we remove the `timespan` and use `start_date` and `time_range` for iteratively collecting articles.\n\n\n### Database Query\n\nFor event database (both V1 and V2):\n\n```python\nfrom newsfeed.news.db.events import *\n# GDELT Event Database Version 1.0\ngdelt_events_v1_events = EventV1(start_date = \"2021-01-01\", end_date = \"2021-01-02\")\nresults_v1_events = gdelt_events_v1_events.query()\nresults_v1_events_nowtime = gdelt_events_v1_events.query_nowtime()\n\n# GDELT Event Database Version 2.0 - Event\ngdelt_events_v2_events = EventV2(start_date = \"2021-01-01-00-00-00\", end_date = \"2021-01-02-00-00-00\")\nresults_v2_events = gdelt_events_v2_events.query()\nresults_v2_events_nowtime = gdelt_events_v2_events.query_nowtime()\n\n# GDELT Event Database Version 2.0 - Mentions\ngdelt_events_v2_mentions = EventV2(start_date = \"2021-01-01-00-00-00\", end_date = \"2021-01-02-00-00-00\", table = \"mentions\")\nresults_v2_mentions = gdelt_events_v2_mentions.query()\nresults_v2_mentions_nowtime = gdelt_events_v2_mentions.query_nowtime()\n\n```\n\nFor GKG databse (both V1 and V2):\n\n```python\nfrom newsfeed.news.db.gkg import *\n# GDELT GKG Database Version 1.0\ngdelt_events_v1_gkg = GKGV1(start_date = \"2021-01-01\", end_date = \"2021-01-02\")\nresults_v1_gkg = gdelt_events_v1_gkg.query()\nresults_v1_gkg_nowtime = gdelt_events_v1_gkg.query_nowtime()\n\nfrom newsfeed.news.db.gkg import *\n# GDELT GKG Database Version 2.0\ngdelt_events_v2_gkg = GKGV2(start_date = \"2021-01-01-00-00-00\", end_date = \"2021-01-02-00-00-00\")\nresults_v2_gkg = gdelt_events_v2_gkg.query()\nresults_v2_gkg_nowtime = gdelt_events_v2_gkg.query_nowtime()\n```\n\nFor GEG, VGEG and GDG:\n\n```python\nfrom newsfeed.news.db.others import *\n# GDELT Global Entity Graph\ngdelt_v3_geg = GEG(start_date = \"2020-01-01\", end_date = \"2020-01-02\")\ngdelt_v3_geg_result = gdelt_v3_geg.query()\n\n# GDELT Visual Global Entity Graph\ngdelt_v3_vgeg = VGEG(query_date = \"2020-01-01\", domain = \"CNN\")\ngdelt_v3_vgeg_result = gdelt_v3_vgeg.query() \n\n# GDELT Global Difference Graph\ngdelt_v3_gdg = GDG(query_date=\"2018-08-27-14-00-00\")\ngdelt_v3_gdg_result = gdelt_v3_gdg.query()\n\n# GDELT Global Frontpage Graph\ngdelt_v3_gfg = GFG(query_date=\"2018-03-02-02-00-00\")\ngdelt_v3_gfg_result = gdelt_v3_gfg.query()\n```\n\n### Utilities\n\nFull-text downloader (based on [`newspaper3k`](https://github.com/codelucas/newspaper) and [Wayback Machine](https://archive.org/help/wayback_api.php))\n\n```python\nfrom newsfeed.utils import fulltext as ft\nart = ft.download(url=\"https://english.news.cn/20220205/a4e93df9162e4053af64c392b5f5bfec/c.html\")\nprint(\"full text: \\n {}\".format(art.text))\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Cyclododecene/newsfeed", "keywords": "GDELT NewsFeed", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "newsfeed", "package_url": "https://pypi.org/project/newsfeed/", "platform": null, "project_url": "https://pypi.org/project/newsfeed/", "project_urls": { "Homepage": "https://github.com/Cyclododecene/newsfeed" }, "release_url": "https://pypi.org/project/newsfeed/0.1.3/", "requires_dist": [ "numpy (>=1.15.4)", "pandas (>=0.25)", "requests (>=2.22.0)", "tqdm (>=4.62.1)", "newspaper3k (>=0.2.5)", "lxml (>=4.4.2)" ], "requires_python": ">=3.6", "summary": "", "version": "0.1.3", "yanked": false, "yanked_reason": null }, "last_serial": 13646101, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "13e3628979e8bd5b646e57d53af9924a", "sha256": "a2aff2954c7ce20d43fb7727af26f05df831f4cd56868b87e6b363913ea78fa3" }, "downloads": -1, "filename": "newsfeed-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "13e3628979e8bd5b646e57d53af9924a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 29609, "upload_time": "2022-02-09T05:49:46", "upload_time_iso_8601": "2022-02-09T05:49:46.376428Z", "url": "https://files.pythonhosted.org/packages/9c/f7/a18018b027c705d53755409b3b2c45428f632b8b66d8b83f22b34742d056/newsfeed-0.1.0-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "b9dfc884ed5fc602f7c90722ae99adbe", "sha256": "260666c62f067dfe440aa209afb56028032d951b1d766c8837de18b7cdb8134c" }, "downloads": -1, "filename": "newsfeed-0.1.0.tar.gz", "has_sig": false, "md5_digest": "b9dfc884ed5fc602f7c90722ae99adbe", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 26999, "upload_time": "2022-02-09T05:49:47", "upload_time_iso_8601": "2022-02-09T05:49:47.573805Z", "url": "https://files.pythonhosted.org/packages/75/a4/347915d5cddc78d033f937a5cf051476fb252bdd86f13a4cd01a4f106fde/newsfeed-0.1.0.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "5b07c01d1d0c12c4c140e02a7f8df71a", "sha256": "4d1724c1845c9d3051f89f572148a7dce12fb8628f7b710dae8ebad69bd6edd9" }, "downloads": -1, "filename": "newsfeed-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5b07c01d1d0c12c4c140e02a7f8df71a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 29620, "upload_time": "2022-04-27T09:09:38", "upload_time_iso_8601": "2022-04-27T09:09:38.247244Z", "url": "https://files.pythonhosted.org/packages/d6/a3/c128a244ef640e436c43cdb4523360067a8c6c742d3ff6032eb87f0812c6/newsfeed-0.1.1-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "cb124286cbf5aa26674560cbc990eeea", "sha256": "3bb89ebf12fa1d40f20f412ed20ad4c99424cdf2423da0ba9e18189abcdb6dc4" }, "downloads": -1, "filename": "newsfeed-0.1.1.tar.gz", "has_sig": false, "md5_digest": "cb124286cbf5aa26674560cbc990eeea", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 27087, "upload_time": "2022-04-27T09:09:39", "upload_time_iso_8601": "2022-04-27T09:09:39.306257Z", "url": "https://files.pythonhosted.org/packages/06/a8/7f415f5fc0716b0b4aa6754330a9277bb6e868897f4b819c88db13788fc8/newsfeed-0.1.1.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "82e9535bf52652e49e1e836a89d570eb", "sha256": "c2342ba610b54067da1aea9d0b7bd67dfcb05b79bed6ff72f459d77bcf151a0c" }, "downloads": -1, "filename": "newsfeed-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "82e9535bf52652e49e1e836a89d570eb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 29604, "upload_time": "2022-04-27T09:17:13", "upload_time_iso_8601": "2022-04-27T09:17:13.124960Z", "url": "https://files.pythonhosted.org/packages/05/3f/ee7f2d69a296861a381de65ae8af026a25e0c7aff2c59f24ca25d43f10fb/newsfeed-0.1.2-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "28557e30bf33597629b5f83ed0886e44", "sha256": "875734696dc8793839eb029a6fa1851783fad349e6c21c201acf217b905ac607" }, "downloads": -1, "filename": "newsfeed-0.1.2.tar.gz", "has_sig": false, "md5_digest": "28557e30bf33597629b5f83ed0886e44", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 27085, "upload_time": "2022-04-27T09:17:14", "upload_time_iso_8601": "2022-04-27T09:17:14.763497Z", "url": "https://files.pythonhosted.org/packages/45/b6/bb7a6de5f8cb922f095626cc023d34654dd04a93bf0e7c55eaffc087a579/newsfeed-0.1.2.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "6a724b54f6a0fb0bc9f4c7ace6a12a1d", "sha256": "87c566fb46a6db587358d49e8edb486ae789fdd2ebb115135fa4266789ec27c0" }, "downloads": -1, "filename": "newsfeed-0.1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "6a724b54f6a0fb0bc9f4c7ace6a12a1d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 29708, "upload_time": "2022-04-28T01:14:22", "upload_time_iso_8601": "2022-04-28T01:14:22.869618Z", "url": "https://files.pythonhosted.org/packages/a5/e6/f8bda5c89bf75bf2b36e1956a7d2c3be2cbf509276da96bc59b20d37ab50/newsfeed-0.1.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "a21b604e89d219136c03fd8c6506fd96", "sha256": "ab9fae6f48147325e26a41b6ec13a0c39212621fa905e97752dfcb2eb2a17612" }, "downloads": -1, "filename": "newsfeed-0.1.3.tar.gz", "has_sig": false, "md5_digest": "a21b604e89d219136c03fd8c6506fd96", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 27175, "upload_time": "2022-04-28T01:14:24", "upload_time_iso_8601": "2022-04-28T01:14:24.244931Z", "url": "https://files.pythonhosted.org/packages/6a/5f/cf58b0414e6c2a1cde798a1d6c16c9bad4552584c3932404f657c03d8799/newsfeed-0.1.3.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6a724b54f6a0fb0bc9f4c7ace6a12a1d", "sha256": "87c566fb46a6db587358d49e8edb486ae789fdd2ebb115135fa4266789ec27c0" }, "downloads": -1, "filename": "newsfeed-0.1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "6a724b54f6a0fb0bc9f4c7ace6a12a1d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 29708, "upload_time": "2022-04-28T01:14:22", "upload_time_iso_8601": "2022-04-28T01:14:22.869618Z", "url": "https://files.pythonhosted.org/packages/a5/e6/f8bda5c89bf75bf2b36e1956a7d2c3be2cbf509276da96bc59b20d37ab50/newsfeed-0.1.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "a21b604e89d219136c03fd8c6506fd96", "sha256": "ab9fae6f48147325e26a41b6ec13a0c39212621fa905e97752dfcb2eb2a17612" }, "downloads": -1, "filename": "newsfeed-0.1.3.tar.gz", "has_sig": false, "md5_digest": "a21b604e89d219136c03fd8c6506fd96", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 27175, "upload_time": "2022-04-28T01:14:24", "upload_time_iso_8601": "2022-04-28T01:14:24.244931Z", "url": "https://files.pythonhosted.org/packages/6a/5f/cf58b0414e6c2a1cde798a1d6c16c9bad4552584c3932404f657c03d8799/newsfeed-0.1.3.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }