{ "info": { "author": "Vikash Singh", "author_email": "vikash.duliajan@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.5" ], "description": "This project has moved to `Flash Text `_. \n-----------------------------------------\n\nsynonym-extractor\n=================\n\nSynonym Extractor is a python library that is loosely based on `Aho-Corasick algorithm\n`_.\n\nThe idea is to extract words that we care about from a given sentence in one pass.\n\nBasically say I have a vocabulary of 10K words and I want to get all the words from that set present in a sentence. A simple regex match will take a lot of time to loop over the 10K documents.\n\nHence we use a simpler yet much faster algorithm to get the desired result.\n\nInstallation\n-------\n::\n\n pip install synonym-extractor\n\nUsage\n------\n::\n \n # import module\n from synonym.extractor import SynonymExtractor\n\n # Create an object of SynonymExtractor\n synonym_extractor = SynonymExtractor()\n\n # add synonyms\n synonym_names = ['NY', 'new-york', 'SF']\n clean_names = ['new york', 'new york', 'san francisco']\n\n for synonym_name, clean_name in zip(synonym_names, clean_names):\n synonym_extractor.add_to_synonym(synonym_name, clean_name)\n\n synonyms_found = synonym_extractor.get_synonyms_from_sentence('I love SF and NY. new-york is the best.')\n\n synonyms_found\n >> ['san francisco', 'new york', 'new york']\n\n\nAlgorithm\n----------\n\nsynonym-extractor is based on `Aho-Corasick algorithm\n`_.\n\nDocumentation\n----------\n\nDocumentation can be found at `Read the Docs\n`_.\n\n\nWhy\n------\n\n::\n\nSay you have a corpus where similar words appear frequently.\n\neg: Last weekened I was in NY.\n I am traveling to new york next weekend.\n\nIf you train a word2vec model on this or do any sort of NLP it will treat NY and new york as 2 different words. \n\nInstead if you create a synonym dictionary like:\n\neg: NY=>new york\n new york=>new york\n\nThen you can extract NY and new york as the same text.\n\nTo do the same with regex it will take a lot of time:\n\n============ ========== = ========= ============\nDocs count # Synonyms : Regex synonym-extractor\n============ ========== = ========= ============\n1.5 million 2K : 16 hours NA\n2.5 million 10K : 15 days 15 mins\n============ ========== = ========= ============\n\nThe idea for this library came from the following `StackOverflow question\n`_.\n\n\nLicense\n-------\n\nThe project is licensed under the MIT license.", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/vi3k6i5/synonym-extractor", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "synonym-extractor", "package_url": "https://pypi.org/project/synonym-extractor/", "platform": "any", "project_url": "https://pypi.org/project/synonym-extractor/", "project_urls": { "Homepage": "http://github.com/vi3k6i5/synonym-extractor" }, "release_url": "https://pypi.org/project/synonym-extractor/1.0/", "requires_dist": null, "requires_python": "", "summary": "Extract synonyms from sentences using Aho Corasick algorithm", "version": "1.0" }, "last_serial": 3103074, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "270ed25c697cab2410a18282d9f4f367", "sha256": "3fbe712ef0b2052225ded965d09f7bb8a6116a96a2abf9bb14cfa0ebd4b85373" }, "downloads": -1, "filename": "synonym_extractor-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "270ed25c697cab2410a18282d9f4f367", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 4307, "upload_time": "2017-07-02T12:50:49", "url": "https://files.pythonhosted.org/packages/34/e5/43fa7b0eca1661734129d818044955b48a042899790e1e7e5736dbd7932b/synonym_extractor-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "774aa6c001d669d05fc89e87c6ff2bb5", "sha256": "49d01db7482bbc8dde215789693dbc3cc5f97c9d0fe727abdd4e564d4ca8e14c" }, "downloads": -1, "filename": "synonym-extractor-0.1.0.tar.gz", "has_sig": false, "md5_digest": "774aa6c001d669d05fc89e87c6ff2bb5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3651, "upload_time": "2017-07-02T12:50:51", "url": "https://files.pythonhosted.org/packages/a8/11/4e830e824b22555456da6837fc12663bbe8b633e52bc0c0ae24aeb8d1c5b/synonym-extractor-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "a90b0a18b78b1ce866045d81bbc6a9a4", "sha256": "f82b1b7ba0c6d779f286abaf9aabd1175ee52bd112433bcded4c1c3a4e8e6a33" }, "downloads": -1, "filename": "synonym_extractor-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a90b0a18b78b1ce866045d81bbc6a9a4", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 5660, "upload_time": "2017-07-09T06:24:07", "url": "https://files.pythonhosted.org/packages/d7/89/b20e62492ccf274768536d2c14c758668888a66bdca9ae622114f99252fe/synonym_extractor-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9f52a83ceb0e6cc2f216e38b0a0377ff", "sha256": "dd66e8b726e080ad2e5b0dcc5764b955b168ac4ddab9b2ecd314c17f1580c588" }, "downloads": -1, "filename": "synonym_extractor-0.1.1-py3.5.egg", "has_sig": false, "md5_digest": "9f52a83ceb0e6cc2f216e38b0a0377ff", "packagetype": "bdist_egg", "python_version": "3.5", "requires_python": null, "size": 5159, "upload_time": "2017-07-09T06:24:10", "url": "https://files.pythonhosted.org/packages/1c/80/92138385f8520d58377d299895011227cca6883344bc0dac64b5dc075e96/synonym_extractor-0.1.1-py3.5.egg" }, { "comment_text": "", "digests": { "md5": "9d290bc8eec522ad89fb7cf92d15c4ca", "sha256": "ebbc405f03453691e1c6022a6e797170e1b2eb35cf032025672958b357d47ad2" }, "downloads": -1, "filename": "synonym-extractor-0.1.1.tar.gz", "has_sig": false, "md5_digest": "9d290bc8eec522ad89fb7cf92d15c4ca", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4508, "upload_time": "2017-07-09T06:24:09", "url": "https://files.pythonhosted.org/packages/64/0e/6ab928be105b7410161f877f7906390088eb17e29ddc671f07f1d5d16d35/synonym-extractor-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "d8ae9f2f672370be8d327cecf5d4ab0d", "sha256": "3d94339e7dc068e53d55e408a8a2302285fda299e05fb56e506e12e2cdf3d05b" }, "downloads": -1, "filename": "synonym-extractor-0.1.2.tar.gz", "has_sig": false, "md5_digest": "d8ae9f2f672370be8d327cecf5d4ab0d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4482, "upload_time": "2017-07-10T12:16:34", "url": "https://files.pythonhosted.org/packages/72/f7/925bb73a283e880040cc91522a3d72a495ddd6eb0c2174b2e49672502890/synonym-extractor-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "7713779ed04ea3b2fda098af2358e92c", "sha256": "756820e57d588f6b5265d8f0897d10fa1dd9d6eca2d895dc7ca785e168031ece" }, "downloads": -1, "filename": "synonym-extractor-0.1.3.tar.gz", "has_sig": false, "md5_digest": "7713779ed04ea3b2fda098af2358e92c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4462, "upload_time": "2017-07-10T12:20:15", "url": "https://files.pythonhosted.org/packages/68/da/06c05c1e45e04e97e5e3d799de118ceaaded4be9ff68edcbcf664945ef7b/synonym-extractor-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "3cf838d5c34d0114ce39f616eecc92ec", "sha256": "e7acd823983807275db6370f31cdab33bd2068015a30484a07d69407154e67d3" }, "downloads": -1, "filename": "synonym-extractor-0.1.4.tar.gz", "has_sig": false, "md5_digest": "3cf838d5c34d0114ce39f616eecc92ec", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4490, "upload_time": "2017-07-21T10:08:47", "url": "https://files.pythonhosted.org/packages/3e/5e/06b43e76e8e1b2b58bf123a2a630df753a2f8684e1d81ab3d3ea006c6668/synonym-extractor-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "12469dc2858ade0b6936a5be49dc2e06", "sha256": "54e5102da4a0e9638dd34f47b2c33f6ebe854b587f0ce98de7ad9a76a5620ec4" }, "downloads": -1, "filename": "synonym-extractor-0.1.5.tar.gz", "has_sig": false, "md5_digest": "12469dc2858ade0b6936a5be49dc2e06", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4490, "upload_time": "2017-07-21T10:25:57", "url": "https://files.pythonhosted.org/packages/37/9b/aaf4c37f2c64c236f3a667ea7669ed8da999d4cf4ba27547e39f0890393b/synonym-extractor-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "75901421619dc452725da7ecd29130fc", "sha256": "b09bbfaabcc1c4422c64d58ccff386c3df1f1ffce957cbae38f78ff2ad29d952" }, "downloads": -1, "filename": "synonym-extractor-0.1.6.tar.gz", "has_sig": false, "md5_digest": "75901421619dc452725da7ecd29130fc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4674, "upload_time": "2017-07-21T11:11:26", "url": "https://files.pythonhosted.org/packages/db/00/6d0e1919b8b3ed9e892b1e68fc1e2c4a8902b5128274577b327859103f38/synonym-extractor-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "09ad40cd27349861eef099d3f5483c30", "sha256": "8620271b063d55a2aaabc0e030e26ad51f5629937756dddf248471f32ed7f4be" }, "downloads": -1, "filename": "synonym-extractor-0.1.7.tar.gz", "has_sig": false, "md5_digest": "09ad40cd27349861eef099d3f5483c30", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5264, "upload_time": "2017-08-17T09:42:29", "url": "https://files.pythonhosted.org/packages/e3/18/6fafe8c5f01f89aae3f80b0175bf1753c3f06ebde20d85bde313370d27ca/synonym-extractor-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "bd15ef9a6db28c5679be5b3f5752a0f8", "sha256": "f45e7044c90bbc2d6ff223ff569972d9e8a855b49ab2524d8f7ee3ac16876cab" }, "downloads": -1, "filename": "synonym-extractor-0.1.8.tar.gz", "has_sig": false, "md5_digest": "bd15ef9a6db28c5679be5b3f5752a0f8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5306, "upload_time": "2017-08-17T09:52:15", "url": "https://files.pythonhosted.org/packages/19/6f/d0c3a11892025b7038a9e1520a54716cfa44f046deb75ddc37fa380081a7/synonym-extractor-0.1.8.tar.gz" } ], "1.0": [ { "comment_text": "", "digests": { "md5": "697468f10ba47b88e77c6c60fc134cfe", "sha256": "d2c1f20af9bf4fb3f482a75ff119dc87392d3977245f8f51e5d43b39af0b3ee6" }, "downloads": -1, "filename": "synonym-extractor-1.0.tar.gz", "has_sig": false, "md5_digest": "697468f10ba47b88e77c6c60fc134cfe", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5313, "upload_time": "2017-08-17T10:00:00", "url": "https://files.pythonhosted.org/packages/66/a8/0ff0b0f25aeea5f5879586a216ccefc1f80fc1fb989ae84c144e5751ca6e/synonym-extractor-1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "697468f10ba47b88e77c6c60fc134cfe", "sha256": "d2c1f20af9bf4fb3f482a75ff119dc87392d3977245f8f51e5d43b39af0b3ee6" }, "downloads": -1, "filename": "synonym-extractor-1.0.tar.gz", "has_sig": false, "md5_digest": "697468f10ba47b88e77c6c60fc134cfe", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5313, "upload_time": "2017-08-17T10:00:00", "url": "https://files.pythonhosted.org/packages/66/a8/0ff0b0f25aeea5f5879586a216ccefc1f80fc1fb989ae84c144e5751ca6e/synonym-extractor-1.0.tar.gz" } ] }