{ "info": { "author": "Paul Landes", "author_email": "landes@mailc.net", "bugtrack_url": null, "classifiers": [], "description": "# Create an SQLite database from the Google ngrams database.\n\n[![Travis CI Build Status][travis-badge]][travis-link]\n[![PyPI][pypi-badge]][pypi-link]\n\nCreates an SQLite database of the one million [n-grams] datasets from Google.\nThis code downloads the [n-gram data sets] corpus and then creates an [SQLite]\ndatabase file with the contents. It also provides a simple API for [n-gram]\nlookups.\n\n\n\n## Table of Contents\n\n- [Installation](#installation)\n- [Usage](#usage)\n - [Command Line](#command-line)\n - [Programmatic Interface](#programmatic-interface)\n- [Obtaining](#obtaining)\n- [Changelog](#changelog)\n- [License](#license)\n\n\n\n\n\n## Installation\n\n1. To make life easier, install [GNU Make]. If you do not, you'll need to\n follow the steps given in the [makefile](./makefile).\n2. Download the 1 million [n-gram data sets]: `make download`. This should\n take a few minutes with a good Internet connection.\n3. Create and load the [SQLite] database from the downloaded corpus: `make\n load`. Depending on the processor speed, this should take about an hour.\n4. Install from the command line either from source (`make install`) or from\n [pip](#obtaining).\n\nIf you want to use the program on the command line (as opposed to through the API),\ncreate the file `~/.ngramdbrc` with the following contents:\n```ini\n[default]\n[ngram_db]\ndata_dir=${HOME}/view/nlp/ngramdb/data\n```\n\n## Usage\n\nThis project can be used either from the command line or as an API.\n\n\n### Command Line\n\nTo use from the command line:\n```bash\n% ngramdb query -g the -y 2005\n631362690 0.56880%\n```\nThis gives the number of unigrams (assuming unigrams were built) found since\n2005.\n\n\n### Programmatic Interface\n\nAs in the [installation](#installation) section, create the `~/.ngramdbrc`\nconfiguration file. 
Also note that the API is configured to easily work with\nother Python projects that use the [zensols.actioncli] configuration API.\n\n```python\nfrom zensols.ngramdb import AppConfig, Query\nconf = AppConfig.instance().app_config\nquery = Query(conf)\nstash = query.stash\nn_occurs = stash['The']\nprint(f'{n_occurs} {100 * n_occurs / len(stash):.5f}%')\n\n# => 631362690 0.56880%\n```\n\n\n## Obtaining\n\nThe easiest way to install the command line program is via the `pip` installer:\n```bash\npip3 install zensols.ngramdb\n```\n\nBinaries are also available on [pypi].\n\n\n## Changelog\n\nAn extensive changelog is available [here](CHANGELOG.md).\n\n\n## License\n\nCopyright (c) 2019 Paul Landes\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the \"Software\"), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies\nof the Software, and to permit persons to whom the Software is furnished to do\nso, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n\n\n\n[travis-link]: https://travis-ci.org/plandes/ngramdb\n[travis-badge]: https://travis-ci.org/plandes/ngramdb.svg?branch=master\n[pypi]: https://pypi.org/project/zensols.ngramdb/\n[pypi-link]: https://pypi.python.org/pypi/zensols.ngramdb\n[pypi-badge]: https://img.shields.io/pypi/v/zensols.ngramdb.svg\n\n[n-gram data sets]: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html\n[n-gram]: https://en.wikipedia.org/wiki/N-gram\n[GNU Make]: https://www.gnu.org/software/make/\n[zensols.actioncli]: https://github.com/plandes/actioncli\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/plandes/ngramdb/releases/download/v0.0.1/zensols.ngramdb-0.0.1-py3-none-any.whl", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/plandes/ngramdb", "keywords": "nlp", "license": "", "maintainer": "", "maintainer_email": "", "name": "zensols.ngramdb", "package_url": "https://pypi.org/project/zensols.ngramdb/", "platform": "", "project_url": "https://pypi.org/project/zensols.ngramdb/", "project_urls": { "Download": "https://github.com/plandes/ngramdb/releases/download/v0.0.1/zensols.ngramdb-0.0.1-py3-none-any.whl", "Homepage": "https://github.com/plandes/ngramdb" }, "release_url": "https://pypi.org/project/zensols.ngramdb/0.0.1/", "requires_dist": [ "zensols.actioncli (>=1.0.17)", "zensols.db (>=0.0.2)" ], "requires_python": "", "summary": "Creates an SQLite database ngrams.", "version": "0.0.1" }, "last_serial": 5616295, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "3b8ce7f25c30ddefa8df63c18be3b5dd", "sha256": "9b5152d26b53d4a6ad018e174f93dd31e203139af2e95c5558f4512a7f34a656" }, 
"downloads": -1, "filename": "zensols.ngramdb-0.0.1-py3.6.egg", "has_sig": false, "md5_digest": "3b8ce7f25c30ddefa8df63c18be3b5dd", "packagetype": "bdist_egg", "python_version": "3.6", "requires_python": null, "size": 17471, "upload_time": "2019-08-01T01:45:35", "url": "https://files.pythonhosted.org/packages/14/1e/cd53c7ee6a6dfc04c8725c556792de9563b5d62e5d8c55c61416b8cc6b4b/zensols.ngramdb-0.0.1-py3.6.egg" }, { "comment_text": "", "digests": { "md5": "6bb1b7fdf8edf12b2fcb87dc840587fb", "sha256": "d6156cdd187fed6f3894733765e3fb31a13ca8dbc331bde8aa8e892d0284e57c" }, "downloads": -1, "filename": "zensols.ngramdb-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "6bb1b7fdf8edf12b2fcb87dc840587fb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10376, "upload_time": "2019-08-01T01:45:33", "url": "https://files.pythonhosted.org/packages/67/6a/2fddc036bf71b8599f04f56c6664d2c3766dccd5b0c9f9445fbeced8dc52/zensols.ngramdb-0.0.1-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3b8ce7f25c30ddefa8df63c18be3b5dd", "sha256": "9b5152d26b53d4a6ad018e174f93dd31e203139af2e95c5558f4512a7f34a656" }, "downloads": -1, "filename": "zensols.ngramdb-0.0.1-py3.6.egg", "has_sig": false, "md5_digest": "3b8ce7f25c30ddefa8df63c18be3b5dd", "packagetype": "bdist_egg", "python_version": "3.6", "requires_python": null, "size": 17471, "upload_time": "2019-08-01T01:45:35", "url": "https://files.pythonhosted.org/packages/14/1e/cd53c7ee6a6dfc04c8725c556792de9563b5d62e5d8c55c61416b8cc6b4b/zensols.ngramdb-0.0.1-py3.6.egg" }, { "comment_text": "", "digests": { "md5": "6bb1b7fdf8edf12b2fcb87dc840587fb", "sha256": "d6156cdd187fed6f3894733765e3fb31a13ca8dbc331bde8aa8e892d0284e57c" }, "downloads": -1, "filename": "zensols.ngramdb-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "6bb1b7fdf8edf12b2fcb87dc840587fb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10376, "upload_time": 
"2019-08-01T01:45:33", "url": "https://files.pythonhosted.org/packages/67/6a/2fddc036bf71b8599f04f56c6664d2c3766dccd5b0c9f9445fbeced8dc52/zensols.ngramdb-0.0.1-py3-none-any.whl" } ] }