{ "info": { "author": "Dan Shiebler", "author_email": "danshiebler@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# repcomp\n\n`repcomp` (short for representation comparison) is a package for comparing trained embedding models. You can use it to compare Deep Neural Networks, Matrix Factorization models, Graph Embeddings, Word Embeddings, etc.\n\n`repcomp` supports the following embedding comparison approaches:\n\n* Nearest Neighbors: Fetch the nearest neighbor set of each entity according to embedding distances, and compare model A's neighbor sets to model B's neighbor sets.\n* Canonical Correlation: Treat embedding components as observations of random variables and compute the canonical correlations between model A and model B. \n* Unit Match: Form a unit-to-unit matching between model A's embedding components and model B's embedding components and measure the correlations of the matched units.\n\nA simple example comparing random embeddings:\n\n```python\n from repcomp.comparison import CCAComparison\n import numpy as np\n\n # Generate random embedding matrices\n num_samples = 100\n num_components = 10\n embedding_1 = np.random.random((num_samples, num_components))\n embedding_2 = embedding_1 + 0.5 * np.random.random((num_samples, num_components))\n\n # Run the comparison\n comparator = CCAComparison()\n sim = comparator.run_comparison(embedding_1, embedding_2)\n print(\"The canonical correlation similarity is {}\".format(sim[\"similarity\"]))\n```\n\nA more involved example comparing word embeddings:\n\n```python\n import gensim.downloader as api\n import numpy as np\n from repcomp.comparison import NeighborsComparison\n\n # Load word vectors from gensim\n glove_wiki_50 = api.load(\"glove-wiki-gigaword-50\")\n glove_twitter_50 = api.load(\"glove-twitter-50\")\n\n # Build the embedding matrices over the shared vocabularies\n shared_vocab = set(glove_wiki_50.vocab.keys()).intersection(\n set(glove_twitter_50.vocab.keys()))\n glove_wiki_50_vectors = np.vstack([glove_wiki_50.get_vector(word) for word in shared_vocab])\n glove_twitter_50_vectors = np.vstack([glove_twitter_50.get_vector(word) for word in shared_vocab])\n\n # Run the comparison\n comparator = NeighborsComparison()\n print(\"The neighbors similarity between glove-wiki-gigaword-50 and glove-twitter-50 is {}\".format(\n comparator.run_comparison(glove_wiki_50_vectors, glove_twitter_50_vectors)[\"similarity\"]))\n```\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pypa/sampleproject", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "repcomp", "package_url": "https://pypi.org/project/repcomp/", "platform": "", "project_url": "https://pypi.org/project/repcomp/", "project_urls": { "Homepage": "https://github.com/pypa/sampleproject" }, "release_url": "https://pypi.org/project/repcomp/0.1/", "requires_dist": [ "pandas (>=0.20.3)", "gensim (>=0.13.3)", "keras (>=2.0.1)", "annoy (>=1.12.0)", "mock (>=2.0.0)", "tqdm (>=4.19.5)", "scikit-learn (>=0.19.1)", "pytest (>=3.4.1)" ], "requires_python": "", "summary": "A package for comparing trained embedding models.", "version": "0.1" }, "last_serial": 4388591, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "f301b48f4aa2e11e60bb2c86144c1aa8", "sha256": "52184ecd2ffbdbef4587040a71d69ae1301caf8ab450ea357f6f96cfd4415808" }, "downloads": -1, "filename": "repcomp-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "f301b48f4aa2e11e60bb2c86144c1aa8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6079, "upload_time": "2018-10-18T02:16:46", "url": "https://files.pythonhosted.org/packages/04/e1/6045f77dfb72bdf320b66c0bbf16dc4f35490807c5a63b062de79190329e/repcomp-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d2178aa3ad61934d9e9ea01d483dff36", "sha256": "6d742e78699483b67ffdc7a82e000d076bab91ff956b65f64555dd1191c3356f" }, "downloads": -1, "filename": "repcomp-0.1.tar.gz", "has_sig": false, "md5_digest": "d2178aa3ad61934d9e9ea01d483dff36", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4263, "upload_time": "2018-10-18T02:16:46", "url": "https://files.pythonhosted.org/packages/e0/a8/8d89424ded043082ee90ef3cc7752d5f144f583f75ae17cd0a873d875a86/repcomp-0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f301b48f4aa2e11e60bb2c86144c1aa8", "sha256": "52184ecd2ffbdbef4587040a71d69ae1301caf8ab450ea357f6f96cfd4415808" }, "downloads": -1, "filename": "repcomp-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "f301b48f4aa2e11e60bb2c86144c1aa8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6079, "upload_time": "2018-10-18T02:16:46", "url": "https://files.pythonhosted.org/packages/04/e1/6045f77dfb72bdf320b66c0bbf16dc4f35490807c5a63b062de79190329e/repcomp-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d2178aa3ad61934d9e9ea01d483dff36", "sha256": "6d742e78699483b67ffdc7a82e000d076bab91ff956b65f64555dd1191c3356f" }, "downloads": -1, "filename": "repcomp-0.1.tar.gz", "has_sig": false, "md5_digest": "d2178aa3ad61934d9e9ea01d483dff36", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4263, "upload_time": "2018-10-18T02:16:46", "url": "https://files.pythonhosted.org/packages/e0/a8/8d89424ded043082ee90ef3cc7752d5f144f583f75ae17cd0a873d875a86/repcomp-0.1.tar.gz" } ] }