{ "info": { "author": "Benjamin Fox", "author_email": "foxbenjaminfox@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Programming Language :: Python :: 3" ], "description": "# Semantic String Similarity CLI\n\n`simil` is a CLI interface to [`spacy`](https://spacy.io)'s string similarity engine. It uses the `en_vectors_web_lg` dataset to compare strings for their English semantic similarity. Given two words, phrases, or sentences, `simil` will tell you how similar their meanings are.\n\n## Installation\n\nFirst install `simil` itself:\n```bash\n$ pip3 install --user -U simil\n```\n\nNow install one of spacy's web_vector models:\n\n```\n$ python3 -m spacy download en_vectors_web_lg\n```\n\nYou can choose between `en_vectors_web_lg`, `en_core_web_lg`, and `en_core_web_md`, (`en_core_web_sm` don't include word vectors at all, and can't be used with `simil`.) `simil` will use the largest model that you have installed, with preference for the `vectors` model over a `core` model.\n\nI suggest using the large vectors model (`en_vectors_web_lg`), but you might want to use a smaller model in order to save on disk space or memory usage.\n\n## Usage:\n```bash\n$ sim first_file.txt second_file.txt # compare two files\n$ sim -s \"first string\" \"second string\" # compare two strings\n```\n\nThe output is a number between 0 and 1, representing how similar the two strings are.\n\n## Details:\n\n`simil` uses Spacy's word vector models trained with [`GLoVe`](https://nlp.stanford.edu/projects/glove/), such as [`en_vectors_web_lg`](https://spacy.io/models/en#en_vectors_web_lg).\n\nThis can be a large dataset, which makes for long startup times. So `simil` spins off a process in the background to hold the model, and works under a client-server model with it. This means that if you run `simil` a number of times in a row, only the first run is slow.\n\nThis background process does take up a fair bit of memory, typically around 2GB (for the `en_vectors_web_lg` model). After 10 minutes of inactivity it will automatically be killed, in order not to take up memory indefinitely. You can change the length of this timeout with the `--timeout` flag.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/foxbenjaminfox/string-similarity-cli", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "simil", "package_url": "https://pypi.org/project/simil/", "platform": "", "project_url": "https://pypi.org/project/simil/", "project_urls": { "Homepage": "https://github.com/foxbenjaminfox/string-similarity-cli" }, "release_url": "https://pypi.org/project/simil/0.0.2/", "requires_dist": [ "rpyc", "spacy" ], "requires_python": "", "summary": "CLI for semantic string similarity", "version": "0.0.2" }, "last_serial": 5577559, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "4425767b574b5c23aecd30faa99b9855", "sha256": "a669ec1ee83d7d582b701685d0d6cc905e215ddc519b6b21c770343e4b21638b" }, "downloads": -1, "filename": "simil-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4425767b574b5c23aecd30faa99b9855", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 17885, "upload_time": "2019-07-24T13:17:03", "url": "https://files.pythonhosted.org/packages/de/f5/55385babcb872c97c4a42f4231d422a60c9d4f8d4fbad65e8248f0032d7a/simil-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9926a3ced34f2ea0f61d46b00e93af59", "sha256": "f001b1ed3332904252ca0a51b08238fa38e79f6722550cf13ff145813d4f53b6" }, "downloads": -1, "filename": "simil-0.0.2.tar.gz", "has_sig": false, "md5_digest": "9926a3ced34f2ea0f61d46b00e93af59", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4396, "upload_time": "2019-07-24T13:17:05", "url": "https://files.pythonhosted.org/packages/4a/9e/b058d2c36d06710884ef25242b556a6cd66012ecc3c7173856e9723ed3e7/simil-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4425767b574b5c23aecd30faa99b9855", "sha256": "a669ec1ee83d7d582b701685d0d6cc905e215ddc519b6b21c770343e4b21638b" }, "downloads": -1, "filename": "simil-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4425767b574b5c23aecd30faa99b9855", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 17885, "upload_time": "2019-07-24T13:17:03", "url": "https://files.pythonhosted.org/packages/de/f5/55385babcb872c97c4a42f4231d422a60c9d4f8d4fbad65e8248f0032d7a/simil-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9926a3ced34f2ea0f61d46b00e93af59", "sha256": "f001b1ed3332904252ca0a51b08238fa38e79f6722550cf13ff145813d4f53b6" }, "downloads": -1, "filename": "simil-0.0.2.tar.gz", "has_sig": false, "md5_digest": "9926a3ced34f2ea0f61d46b00e93af59", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4396, "upload_time": "2019-07-24T13:17:05", "url": "https://files.pythonhosted.org/packages/4a/9e/b058d2c36d06710884ef25242b556a6cd66012ecc3c7173856e9723ed3e7/simil-0.0.2.tar.gz" } ] }