{ "info": { "author": "Dmitrijs Milajevs", "author_email": "dimazest@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: MacOS :: MacOS X", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Text Processing :: Linguistic", "Topic :: Utilities" ], "description": "=========================\n Google ngram downloader\n=========================\n\n.. image:: https://travis-ci.org/dimazest/google-ngram-downloader.png?branch=master\n :target: https://travis-ci.org/dimazest/google-ngram-downloader\n\n.. image:: https://coveralls.io/repos/dimazest/google-ngram-downloader/badge.png?branch=master\n :target: https://coveralls.io/r/dimazest/google-ngram-downloader?branch=master\n\n.. image:: https://zenodo.org/badge/4321/dimazest/google-ngram-downloader.png\n :target: http://dx.doi.org/10.5281/zenodo.11884\n :alt: Zenodo doi.\n\n`The Google Books Ngram Viewer dataset`__ is a freely available resource under\na `Creative Commons Attribution 3.0 Unported License`__ which provides ngram\ncounts over books scanned by Google.\n\n__ http://storage.googleapis.com/books/ngrams/books/datasetsv2.html\n__ http://creativecommons.org/licenses/by/3.0/\n\nThe data is so big, that storing it is almost impossible. However, sometimes\nyou need an aggregate data over the dataset. For example to build a\nco-occurrence matrix.\n\nThis package provides an iterator over the dataset stored at Google. It\ndecompresses the data on the fly and provides you the access to the underlying\ndata.\n\nFeatures\n========\n\n* Download ngrams of various length and languages.\n* Access to part of ngrams, e.g. ones that start with an 'a'.\n\nInstallation\n============\n\n::\n\n pip install google-ngram-downloader\n\n\nThe command line tool\n=====================\n\nIt also provides a simple command line tool to download the ngrams called\n`google-ngram-downloader`. Refer to the help to see available actions::\n\n google-ngram-downloader help\n usage: google-ngram-downloader [options]\n\n commands:\n\n cooccurrence Write the cooccurrence frequencies of a word and its contexts.\n download Download The Google Books Ngram Viewer dataset version 20120701.\n help Show help for a given help topic or a help overview.\n readline Print the raw content.\n\n\nExample use of the API\n======================\n\n>>> from google_ngram_downloader import readline_google_store\n>>>\n>>> fname, url, records = next(readline_google_store(ngram_len=5))\n>>> fname\n'googlebooks-eng-all-5gram-20120701-0.gz'\n>>> url\n'http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-5gram-20120701-0.gz'\n>>> next(records)\nRecord(ngram=u'0 \" A most useful', year=1860, match_count=1, volume_count=1)\n\nChanges\n=======\n\nVersion 4.0.1\n-------------\n\n* Citation information.\n* Tests are correctly packaged for a release.\n\nVersion 4.0.0\n-------------\n\n* Added 'indices' keyword. Thanks to neocortex.\n* Added 'language' flat. Thanks to Ray Powell (rpowellgit).\n\nVersion 3.1.1\n-------------\n\n* Non-unique contexts are taken into account inside of an ngram.\n\nVersion 3.1\n-----------\n\n* The ``cooccurrence`` command does not perform any ngram modification.\n\nVersion 3.0\n-----------\n\n* `download`, `readile` and `cooccurrence` subcommands.\n* `readline_google_store` transforms lines to `Record` in several processes.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/dimazest/google-ngram-downloader", "keywords": "", "license": "MIT License", "maintainer": null, "maintainer_email": null, "name": "google-ngram-downloader", "package_url": "https://pypi.org/project/google-ngram-downloader/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/google-ngram-downloader/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/dimazest/google-ngram-downloader" }, "release_url": "https://pypi.org/project/google-ngram-downloader/4.0.1/", "requires_dist": null, "requires_python": null, "summary": "The streaming access to the Google ngram data.", "version": "4.0.1" }, "last_serial": 2297485, "releases": { "1": [], "2": [ { "comment_text": "", "digests": { "md5": "0a84d060f92c060d39c4e072400fc2ab", "sha256": "fba846a81d5610173dba5c356325cffd2ced22464af4241a830f35f1d20dbaf4" }, "downloads": -1, "filename": "google-ngram-downloader-2.tar.gz", "has_sig": false, "md5_digest": "0a84d060f92c060d39c4e072400fc2ab", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7162, "upload_time": "2013-11-27T15:38:25", "url": "https://files.pythonhosted.org/packages/70/39/2308412d8e892eebd0d55f0d41fe8ee3eac5813d323a19a35ac0cf732ac2/google-ngram-downloader-2.tar.gz" } ], "3.0": [ { "comment_text": "", "digests": { "md5": "8dab4d647ad7cfd1c936a99d986d5a4a", "sha256": "59d06b5a456feb31da05a391b608d38c46fa9f29e6689f05fbdc7ef42ce81c65" }, "downloads": -1, "filename": "google-ngram-downloader-3.0.tar.gz", "has_sig": false, "md5_digest": "8dab4d647ad7cfd1c936a99d986d5a4a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8820, "upload_time": "2013-12-11T15:38:39", "url": "https://files.pythonhosted.org/packages/6d/35/3d65ae2aded17a5f8b0609327f1fe203b2a67f3f451b2e80b37b23cc6cbe/google-ngram-downloader-3.0.tar.gz" } ], "3.0.1": [ { "comment_text": "", "digests": { "md5": "f88c7e31d4070437436c0f7f06915372", "sha256": "5c5e57f28dfdec7007e4824635d516792b37d1b14c3d705ea10ffc31ca9fd155" }, "downloads": -1, "filename": "google-ngram-downloader-3.0.1.tar.gz", "has_sig": false, "md5_digest": "f88c7e31d4070437436c0f7f06915372", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8981, "upload_time": "2013-12-13T16:18:40", "url": "https://files.pythonhosted.org/packages/04/44/deee84a332627302b692f9dac4ab3b6f89a52e9e8577d0530fc2e1b1ae1c/google-ngram-downloader-3.0.1.tar.gz" } ], "3.1": [ { "comment_text": "", "digests": { "md5": "691e6fb552cf66e73d98ccf6b9aead9c", "sha256": "199d043c06cf8c6811fa92069a49a5f8692f7aa7fe996da69a38c41c43b9143a" }, "downloads": -1, "filename": "google-ngram-downloader-3.1.tar.gz", "has_sig": false, "md5_digest": "691e6fb552cf66e73d98ccf6b9aead9c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9026, "upload_time": "2013-12-17T15:48:55", "url": "https://files.pythonhosted.org/packages/e8/ef/6d962d8b3d26671ef083dfd71fa0469eea47d3eaa17dad454e6cdcd41095/google-ngram-downloader-3.1.tar.gz" } ], "3.1.1": [ { "comment_text": "", "digests": { "md5": "c0d0ae3c4c037c62a921ba7036bcab04", "sha256": "dc0302d15da93a96c5ebac173fe9e849da89af013b88a0806405d043a81ff2d5" }, "downloads": -1, "filename": "google-ngram-downloader-3.1.1.tar.gz", "has_sig": false, "md5_digest": "c0d0ae3c4c037c62a921ba7036bcab04", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9277, "upload_time": "2013-12-17T18:44:35", "url": "https://files.pythonhosted.org/packages/d8/cd/7d5fce664a4a94933d5544bba60443cbab3ae6e343dabc793fbcd43941e8/google-ngram-downloader-3.1.1.tar.gz" } ], "4.0.0": [ { "comment_text": "", "digests": { "md5": "c1c1fa6533b4bfbc9830267dfd6bf615", "sha256": "5c05298aea64ea19d3413b2b538a3af759089c0978da4861077351ee080bc869" }, "downloads": -1, "filename": "google-ngram-downloader-4.0.0.tar.gz", "has_sig": false, "md5_digest": "c1c1fa6533b4bfbc9830267dfd6bf615", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9960, "upload_time": "2014-09-23T12:22:39", "url": "https://files.pythonhosted.org/packages/9c/10/aacab80adbdecacc208662992d75f514454c94237ed2e496a4e976b1fe28/google-ngram-downloader-4.0.0.tar.gz" } ], "4.0.1": [ { "comment_text": "", "digests": { "md5": "6dd66872b17dbac341e5451c03ed40e1", "sha256": "c53520de654c1340a778f6f70f9e255a88894661c9ed20bcae6ee564f4f3c269" }, "downloads": -1, "filename": "google-ngram-downloader-4.0.1.tar.gz", "has_sig": false, "md5_digest": "6dd66872b17dbac341e5451c03ed40e1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12169, "upload_time": "2016-08-23T09:08:20", "url": "https://files.pythonhosted.org/packages/6e/18/06f89c493fbdf753163322e19a1680f67139526a7da7fd0cd506d5591ac5/google-ngram-downloader-4.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6dd66872b17dbac341e5451c03ed40e1", "sha256": "c53520de654c1340a778f6f70f9e255a88894661c9ed20bcae6ee564f4f3c269" }, "downloads": -1, "filename": "google-ngram-downloader-4.0.1.tar.gz", "has_sig": false, "md5_digest": "6dd66872b17dbac341e5451c03ed40e1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12169, "upload_time": "2016-08-23T09:08:20", "url": "https://files.pythonhosted.org/packages/6e/18/06f89c493fbdf753163322e19a1680f67139526a7da7fd0cd506d5591ac5/google-ngram-downloader-4.0.1.tar.gz" } ] }