{ "info": { "author": "Valentin Pelloin", "author_email": "valentin.pelloin.etu@univ-lemans.fr", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# SVD2vec\n\nSVD2vec is a Python library for representing the words of documents as vectors. Vectors are created using PMI (Pointwise Mutual Information) and SVD (Singular Value Decomposition).\n\nThis library implements recommendations from \"Improving Distributional Similarity with Lessons Learned from Word Embeddings\" (Omer Levy, Yoav Goldberg, and Ido Dagan). This paper suggests that traditional methods like PMI and SVD can be as good as word2vec when the same hyperparameters are applied.\n\nDocumentation can be found at [https://valentinp72.github.io/svd2vec/index.html](https://valentinp72.github.io/svd2vec/index.html)\n\n### Example\n\n```shell\nwget http://mattmahoney.net/dc/text8.zip -O text8.gz\ngzip -d text8.gz -f\n```\n\n```python\n# Building\n>>> from svd2vec import svd2vec\n>>> documents = [open(\"text8\", \"r\").read().split(\" \")]\n>>> svd = svd2vec(documents, window=2, min_count=100)\n```\n\n```python\n# I/O\n>>> svd.save(\"svd.bin\")\n>>> svd = svd2vec.load(\"svd.bin\")\n```\n\n```python\n# Similarities\n>>> svd.similarity(\"bad\", \"good\")\n# 0.4156516999158368\n>>> svd.similarity(\"monday\", \"friday\")\n# 0.839529117681973\n```\n\n```python\n# Most similar words\n>>> svd.most_similar(positive=[\"january\"], topn=2)\n# [('february', 0.6854849518368631), ('october', 0.6653385092683669)]\n>>> svd.most_similar(positive=['moscow', 'france'], negative=['paris'], topn=4)\n# [('russia', 0.6221746629754187), ('ussr', 0.6024809889985986), ('soviet', 0.5794180517326273), ('bolsheviks', 0.5365123080505297)]\n```\n\n```python\n# Analogies\n>>> svd.analogy(\"paris\", \"france\", \"berlin\")\n# [('germany', 0.6977716641680641), ...]\n>>> svd.analogy(\"road\", \"cars\", \"rail\")\n# [('trains', 
0.7532519174901262), ...]\n>>> svd.analogy(\"cow\", \"cows\", \"pig\")\n# [('pigs', 0.6944101149919422), ...]\n>>> svd.analogy(\"man\", \"men\", \"woman\")\n# [('women', 0.7471792753875327), ...]\n```\n\nUsing [Gensim](https://pypi.org/project/gensim/), you can load an `svd2vec` model through its `word2vec` representation:\n```python\n>>> from gensim.models.keyedvectors import Word2VecKeyedVectors\n>>> svd.save_word2vec_format(\"svd_word2vec_format.txt\")\n>>> keyed_vector = Word2VecKeyedVectors.load_word2vec_format(\"svd_word2vec_format.txt\")\n>>> keyed_vector.similarity(\"good\", \"bad\")\n# 0.54922897\n```\n\n---\n\n[Improving Distributional Similarity with Lessons Learned from Word Embeddings](https://www.mitpressjournals.org/doi/abs/10.1162/tacl_a_00134)
\n**Omer Levy**, **Yoav Goldberg**, and **Ido Dagan**
\nTransactions of the Association for Computational Linguistics 2015 Vol. 3, 211-225
\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/valentinp72/svd2vec", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "svd2vec", "package_url": "https://pypi.org/project/svd2vec/", "platform": "", "project_url": "https://pypi.org/project/svd2vec/", "project_urls": { "Homepage": "https://github.com/valentinp72/svd2vec" }, "release_url": "https://pypi.org/project/svd2vec/0.3/", "requires_dist": [ "Pympler (==0.7)", "joblib (==0.11)", "numpy (==1.16.4)", "pandas (==0.24.2)", "scipy (==1.0.0)", "setuptools (==20.7.0)", "tqdm (==4.19.6)" ], "requires_python": "", "summary": "A library that converts words to vectors using PMI and SVD", "version": "0.3" }, "last_serial": 5471341, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "57861209989ae09b2672689ec57e14c5", "sha256": "21ad07ba77f839086521af58d7b23530e5958060b6a0f56c7a6f3f96d3ae4542" }, "downloads": -1, "filename": "svd2vec-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "57861209989ae09b2672689ec57e14c5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 172932, "upload_time": "2019-06-07T09:06:54", "url": "https://files.pythonhosted.org/packages/b7/8a/ace7cda208872df1a9f233f0a433cb3bfe426f23067651e898023949f7d3/svd2vec-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b8b04b33771a92366afa13089daad324", "sha256": "dbe9d74c687b4cbe82a1686f597a6d80fe0e155ed88929891795667be9da022d" }, "downloads": -1, "filename": "svd2vec-0.1.tar.gz", "has_sig": false, "md5_digest": "b8b04b33771a92366afa13089daad324", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 173115, "upload_time": "2019-06-07T09:06:57", "url": "https://files.pythonhosted.org/packages/a5/a5/b5745a9b99b03b33c10d93ebfc577596579214cbf951c1572da2acfbed11/svd2vec-0.1.tar.gz" } ], "0.2": [ 
{ "comment_text": "", "digests": { "md5": "1812d541240225f19cd299e8ca0853e6", "sha256": "d934f00e051e990345037b810c7cc2e47dec16756d8a5feff3e2781d0a2b8d92" }, "downloads": -1, "filename": "svd2vec-0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1812d541240225f19cd299e8ca0853e6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 174211, "upload_time": "2019-06-07T13:18:59", "url": "https://files.pythonhosted.org/packages/a1/a5/04f7ffd0922cf87773242c3cca4ea70ac59bad131b6c103c3d239b4711ab/svd2vec-0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2565fd6b31dc5e0c609893be840e9592", "sha256": "06dbce82029af90c59bce75d202a71adf6e10a0debdfe67653ee71959c7faaa1" }, "downloads": -1, "filename": "svd2vec-0.2.tar.gz", "has_sig": false, "md5_digest": "2565fd6b31dc5e0c609893be840e9592", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 173488, "upload_time": "2019-06-07T13:19:02", "url": "https://files.pythonhosted.org/packages/60/65/5bd4b8069f74df7bd8aa741f69510e11cd4bebf9b2c2d3ee640713142d19/svd2vec-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "27e09b71a3ba788aab217ca4293f2d39", "sha256": "0f4a6ac1cf43496383d84a8b94be949422ba0a360f076329229376673d4f1724" }, "downloads": -1, "filename": "svd2vec-0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "27e09b71a3ba788aab217ca4293f2d39", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 174274, "upload_time": "2019-07-01T14:28:05", "url": "https://files.pythonhosted.org/packages/66/3b/16ef5ce30906b49edc04f327928a4caea86c0864f0de42591c38fb656af3/svd2vec-0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d6e88406e5d0ece8db9321e99cef0385", "sha256": "53bec4446dea9d0263e873bf04c124ebafea7fb87c182193162adbeedebe2eea" }, "downloads": -1, "filename": "svd2vec-0.3.tar.gz", "has_sig": false, "md5_digest": "d6e88406e5d0ece8db9321e99cef0385", "packagetype": "sdist", 
"python_version": "source", "requires_python": null, "size": 173613, "upload_time": "2019-07-01T14:28:09", "url": "https://files.pythonhosted.org/packages/a6/6f/a6730fd9350587b8471a304a20d01b2173d70b883ebdfd418ee8309337f5/svd2vec-0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "27e09b71a3ba788aab217ca4293f2d39", "sha256": "0f4a6ac1cf43496383d84a8b94be949422ba0a360f076329229376673d4f1724" }, "downloads": -1, "filename": "svd2vec-0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "27e09b71a3ba788aab217ca4293f2d39", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 174274, "upload_time": "2019-07-01T14:28:05", "url": "https://files.pythonhosted.org/packages/66/3b/16ef5ce30906b49edc04f327928a4caea86c0864f0de42591c38fb656af3/svd2vec-0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d6e88406e5d0ece8db9321e99cef0385", "sha256": "53bec4446dea9d0263e873bf04c124ebafea7fb87c182193162adbeedebe2eea" }, "downloads": -1, "filename": "svd2vec-0.3.tar.gz", "has_sig": false, "md5_digest": "d6e88406e5d0ece8db9321e99cef0385", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 173613, "upload_time": "2019-07-01T14:28:09", "url": "https://files.pythonhosted.org/packages/a6/6f/a6730fd9350587b8471a304a20d01b2173d70b883ebdfd418ee8309337f5/svd2vec-0.3.tar.gz" } ] }