{ "info": { "author": "Henry Rosales", "author_email": "hrosmendez@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Education", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Software Development :: Build Tools" ], "description": "# Word Sense Disambiguation wrapper\n\nIn natural language processing, **word sense disambiguation** (WSD) is the problem of determining which \"sense\" (meaning) of a word is activated by the use of the word in a particular context, a process which appears to be largely unconscious in people.\n\nThis is a simple library that wraps two WSD methods: NLTK and Babelfy. \n\n## Requirements\nInstall the dependencies with: \n```bash\npip3 install xmltodict\npip3 install nltk\npip3 install pywsd\n```\nThe NLTK library requires some extra configuration; see this [link](https://pythonprogramming.net/installing-nltk-nlp-python/) for more details.\n\n## Methods\nThe ```wsdNLTK``` method calls the function ```pywsd.disambiguate```, which returns a mapping between words of the input text and their WordNet Synsets. \n```python\nfrom wrapperWSD import WrapperWSD\n\nwsd = WrapperWSD()\nwsd.wsdNLTK(u'My sister has a dog. She loves him.')\n#output: [('sister', Synset('sister.n.02'), 3, 9), ('dog', Synset('pawl.n.01'), 16, 19), ('loves', Synset('sleep_together.v.01'), 25, 30)]\n```\n\nInstead of returning the WordNet Synsets, the method ```wsdNLTK_offset``` returns a mapping between words of the input text and their WordNet offsets. \n\n```python\nwsd.wsdNLTK_offset(u'My sister has a dog. 
She loves him.')\n#sample output (taken from a different input sentence): [('president', 597265, 21, 30), ('USA', 8394922, 38, 41), ('best', 67379, 54, 58)]\n```\n\nA mapping between WordNet and Wikipedia was proposed in **[Miller et al]** and is available for download [here](https://www.informatik.tu-darmstadt.de/media/ukp/data/fileupload_2/lexical_resources/MillerGurevych2014_alignment.tar_1.zip). The next example shows some of its key-value pairs, which map WordNet offsets to Wikipedia pages.\n\n```python\nwd2wiki = {\n 1740: 'https://en.wikipedia.org/wiki/Madison_Square_Garden,_L.P.',\n 2137: 'https://en.wikipedia.org/wiki/Abstraction',\n 2452: 'https://en.wikipedia.org/wiki/Object_(philosophy)',\n 2684: 'https://en.wikipedia.org/wiki/Computer_file',\n 3553: 'https://en.wikipedia.org/wiki/Unit_of_alcohol',\n ...\n }\n```\n\nThe method ```wsdNLTK_links``` uses this mapping to link words to Wikipedia entities whenever a correspondence exists.\n\n```python\nwsd.wsdNLTK_links(u'My sister has a dog. She loves him.')\n#sample output (taken from a different input sentence): [{'start': 38, 'end': 41, 'label': 'USA', 'link': 'United_States_Army'}]\n```\n\nThe method ```wsdBabelfy``` calls Babelfy instead and maps each word to its BabelSynset:\n```python\nwsd.wsdBabelfy(u'My sister has a dog. She loves him.')\n#output: [('sister', 'bn:00071838n', 3, 9), ('dog', 'bn:00015267n', 16, 19), ('loves', 'bn:00090504v', 25, 30)]\n```\n\n\n## Combining the output with Entity Linking\n\nYou can use the [nifwrapper](https://github.com/henryrosalesmendez/nifwrapper) library to merge the WSD outputs with Entity Linking annotations.\n\n```python\nfrom wrapperWSD import WrapperWSD\nfrom nifwrapper import *\n\n\n#---- Obtaining disambiguation\nwsd = WrapperWSD()\ncorefWSD = wsd.wsdNLTK_links(u'My sister has a dog. 
She loves him.')\nprint(\"corefWSD:\",corefWSD)\n#output: [('sister', Synset('sister.n.02'), 3, 9), ('dog', Synset('pawl.n.01'), 16, 19), ('loves', Synset('sleep_together.v.01'), 25, 30)]\n\n\n#---- Obtaining Entity Linking results\n# inline NIF corpus creation\nwrp = NIFWrapper()\ndoc = NIFDocument(\"https://example.org/doc1\")\n#--\n# first sentence: 20 characters, so it spans offsets 0 to 20 (end exclusive)\nsent = NIFSentence(\"https://example.org/doc1#char=0,20\")\nsent.addAttribute(\"nif:beginIndex\",\"0\",\"xsd:nonNegativeInteger\")\nsent.addAttribute(\"nif:endIndex\",\"20\",\"xsd:nonNegativeInteger\")\nsent.addAttribute(\"nif:isString\",\"My sister has a dog.\",\"xsd:string\")\nsent.addAttribute(\"nif:broaderContext\",[\"https://example.org/doc1\"],\"URI LIST\")\n\n\n#-- \na1 = NIFAnnotation(\"https://example.org/doc1#char=3,9\", \"3\", \"9\", [\"https://en.wikipedia.org/wiki/Sibling\"], [\"dbo:FamilyRelations\"])\na1.addAttribute(\"nif:anchorOf\",\"sister\",\"xsd:string\")\nsent.pushAnnotation(a1)\ndoc.pushSentence(sent)\n\n#--\nsent2 = NIFSentence(\"https://example.org/doc1#char=21,35\")\nsent2.addAttribute(\"nif:isString\",\"She loves him.\",\"xsd:string\")\nsent2.addAttribute(\"nif:broaderContext\",[\"https://example.org/doc1\"],\"URI LIST\")\nsent2.addAttribute(\"nif:beginIndex\",\"21\",\"xsd:nonNegativeInteger\")\nsent2.addAttribute(\"nif:endIndex\",\"35\",\"xsd:nonNegativeInteger\")\ndoc.pushSentence(sent2)\n#--\nwrp.pushDocument(doc)\n\n#---- Combining EL annotations with the WSD results \nwrp.extendsDocWithWSD(corefWSD, doc.uri)\nprint(wrp.toString())\n```\n\n\n\n\n## Reference\n\n**[Miller et al]** *WordNet\u2013Wikipedia\u2013Wiktionary: Construction of a Three-way Alignment*. Tristan Miller and Iryna Gurevych. 
2014 [https://pdfs.semanticscholar.org/90cd/22a9cd59dc1fc21f4ec36e9c7d95085f7fb6.pdf](https://pdfs.semanticscholar.org/90cd/22a9cd59dc1fc21f4ec36e9c7d95085f7fb6.pdf)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "Word Sense Disambiguation,NLP", "license": "", "maintainer": "", "maintainer_email": "", "name": "wrapperWSD", "package_url": "https://pypi.org/project/wrapperWSD/", "platform": "", "project_url": "https://pypi.org/project/wrapperWSD/", "project_urls": null, "release_url": "https://pypi.org/project/wrapperWSD/0.0.3/", "requires_dist": [ "check-manifest ; extra == 'dev'", "coverage ; extra == 'test'" ], "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "summary": "Word Sense Disambiguation wrapper", "version": "0.0.3" }, "last_serial": 6001783, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "5f808ae324cd6181581b4bfa72fafa39", "sha256": "583eb554f19ed2da1224ad5dea93010e3246af95a86df9f7ed90ea7c56deda96" }, "downloads": -1, "filename": "wrapperWSD-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5f808ae324cd6181581b4bfa72fafa39", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 2608, "upload_time": "2019-10-19T03:40:37", "url": "https://files.pythonhosted.org/packages/1c/fb/aebac468f558b75bd4731aaf8ed520b913e3ca940aeb30a918e65193b6e2/wrapperWSD-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a53c472f177cd2496ecfb5a0e1c8fbb2", "sha256": "c7135616ed20793168cb806fb559ce45ea4e246af50a26cfca064431eb7848ab" }, "downloads": -1, "filename": "wrapperWSD-0.0.1.tar.gz", "has_sig": false, "md5_digest": "a53c472f177cd2496ecfb5a0e1c8fbb2", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 5778, "upload_time": 
"2019-10-19T03:40:40", "url": "https://files.pythonhosted.org/packages/9e/58/f3c7381451b292cbe60a4a8a5c47d650e59ab6c6e71ebdba0f5a6c1d9db9/wrapperWSD-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "32c40491e3cada046679ed538a8be4b3", "sha256": "2be82c710270bd7f3d3bf1ea40f1121bc2f70f4d893b49311e22b804fc91a803" }, "downloads": -1, "filename": "wrapperWSD-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "32c40491e3cada046679ed538a8be4b3", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 479050, "upload_time": "2019-10-19T03:51:12", "url": "https://files.pythonhosted.org/packages/3c/d4/e8cef683200d0bf6a21bf6d76cccfe4e70bd00a05a37e74caa9047e9e4a6/wrapperWSD-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8630313b0515efb07a08b6f7b3066778", "sha256": "fca180a1c7535a427323424cf80ffbaedebfc2344390f0771170dc27ab4266a1" }, "downloads": -1, "filename": "wrapperWSD-0.0.2.tar.gz", "has_sig": false, "md5_digest": "8630313b0515efb07a08b6f7b3066778", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 473661, "upload_time": "2019-10-19T03:51:15", "url": "https://files.pythonhosted.org/packages/55/3a/7aa99660eba7da760e3c5f5c17a2d064fa09fa9f67f2480ce52cfd2d677c/wrapperWSD-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "df0d5f33f3a33dbf7fdfa1db0d48ab0c", "sha256": "041bc2afd15f13b7ac923245e4ec2964a6995de15682b744d3deb4488e1de978" }, "downloads": -1, "filename": "wrapperWSD-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "df0d5f33f3a33dbf7fdfa1db0d48ab0c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 479575, "upload_time": "2019-10-20T03:25:42", "url": 
"https://files.pythonhosted.org/packages/97/6c/11106949bc596518f3d5970e1475565c8643dcde1427d1b593fe30260b52/wrapperWSD-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ab6ffb9ec2d6dd89cf7d2fd36fc212eb", "sha256": "1b3af7f83aa1f93916fed722a10476880e6c220deae270f6f9e1bacab3bc5a0b" }, "downloads": -1, "filename": "wrapperWSD-0.0.3.tar.gz", "has_sig": false, "md5_digest": "ab6ffb9ec2d6dd89cf7d2fd36fc212eb", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 474820, "upload_time": "2019-10-20T03:25:46", "url": "https://files.pythonhosted.org/packages/a5/07/90f1926d78a4ac5740b726bcaf6e931adfbbc45167db4301d11ced0b5158/wrapperWSD-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "df0d5f33f3a33dbf7fdfa1db0d48ab0c", "sha256": "041bc2afd15f13b7ac923245e4ec2964a6995de15682b744d3deb4488e1de978" }, "downloads": -1, "filename": "wrapperWSD-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "df0d5f33f3a33dbf7fdfa1db0d48ab0c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 479575, "upload_time": "2019-10-20T03:25:42", "url": "https://files.pythonhosted.org/packages/97/6c/11106949bc596518f3d5970e1475565c8643dcde1427d1b593fe30260b52/wrapperWSD-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ab6ffb9ec2d6dd89cf7d2fd36fc212eb", "sha256": "1b3af7f83aa1f93916fed722a10476880e6c220deae270f6f9e1bacab3bc5a0b" }, "downloads": -1, "filename": "wrapperWSD-0.0.3.tar.gz", "has_sig": false, "md5_digest": "ab6ffb9ec2d6dd89cf7d2fd36fc212eb", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 474820, "upload_time": "2019-10-20T03:25:46", "url": "https://files.pythonhosted.org/packages/a5/07/90f1926d78a4ac5740b726bcaf6e931adfbbc45167db4301d11ced0b5158/wrapperWSD-0.0.3.tar.gz" } ] }