{ "info": { "author": "Dumitrescu Stefan and Andrei Marius Avram", "author_email": "dumitrescu.stefan@gmail.com, avramandrei9666@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Information Technology", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Human Machine Interfaces", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Text Processing", "Topic :: Text Processing :: Filters", "Topic :: Text Processing :: General", "Topic :: Text Processing :: Indexing", "Topic :: Text Processing :: Linguistic" ], "description": "# RoWordNet\n\n**RoWordNet stands for Romanian WordNet, a semantic network for the Romanian language**. RoWordNet mimics Princeton WordNet, a large lexical database of English. \nThe building block of a WordNet is the **synset**, which expresses a unique concept. The synset (a synonym set) contains, as the name implies, a number of synonym words known as literals. Each synset also has other properties, such as a definition and links to other synsets, and a part of speech (pos) that places it in one of four categories: nouns, verbs, adverbs and adjectives. Synsets are interlinked by **semantic relations** like hypernymy (\"is-a\"), meronymy (\"is-part\"), antonymy, and others. \n\n## Install\n\nSimply use Python's pip:\n```sh\npip install rowordnet\n```\n\nRoWordNet has two dependencies, _networkx_ and _lxml_, which are automatically installed by pip.\n\n## Intro\n\nRoWordNet is, at its core, a directed graph (powered by networkx) with synset IDs as nodes and relations as edges. 
Synsets (objects) are kept as an ID:object indexed dictionary for O(1) access.\n\nA **synset** has the following data, accessed as properties (others are present, but the following are most important): \n* id : the ID (string) of this synset\n* literals : a list of words (strings) representing a unique concept. These words are synonyms.\n* definition : a longer description (string) of this synset\n* pos : the part of speech of this synset (enum: Synset.Pos.NOUN, VERB, ADVERB, ADJECTIVE)\n* sentiwn : a three-valued list indicating the SentiWN PNO (Positive, Negative, Objective) of this synset.\n\n**Relations** are edges between synsets. Examples of how to list the inbound/outbound relations of a synset, along with other graph operations, are given below.\n\n____\n\n* Demo on basic ops available as a [Jupyter notebook here](jupyter/basic_operations_wordnet.ipynb).\n* Demo on more advanced ops available as a [Jupyter notebook here](jupyter/create_edit_synsets.ipynb).\n* Demo on save/load ops available as a [Jupyter notebook here](jupyter/load_save_wordnet.ipynb).\n* Demo on synset/relation creation and editing available as a [Jupyter notebook here](jupyter/synonym_antonym.ipynb).\n____\n\n## Basic Usage\n\n```python\nimport rowordnet as rwn\nwn = rwn.RoWordNet()\n```\n\nAnd you're good to go. We present a few basic usage examples here:\n\n### Search for a word\n\nAs words are polysemous, searching for a word will likely yield more than one synset. A word is known as a literal in RoWordNet, and every synset has one or more literals that are synonyms.\n```python\nword = 'arbore'\nsynset_ids = wn.synsets(literal=word)\n```\nEach synset has a unique ID, and most operations work with IDs. Here, ``wn.synsets(literal=word)`` returns a list with the IDs of all synsets that contain the word 'arbore', or an empty list if the word is not found. 
\n\nPlease note that the Romanian WordNet also contains words (literals) that are actually expressions like \"tren\\_de\\_marf\u0103\", and searching for \"tren\" will also find this synset.\n\n### Get a synset\n\nCalling ``wn.print_synset(id)`` prints all available info of a particular synset.\n\n```python\nsynset_id = synset_ids[0] # pick the first synset ID found by the search above\nwn.print_synset(synset_id)\n```\n\nTo get the actual Synset object, we simply call ``wn.synset(id)``, or ``wn(id)`` directly.\n\n```python\nsynset_object = wn.synset(synset_id)\nsynset_object = wn(synset_id) # equivalent, shorter form\n```\n\nTo print any individual piece of information, like its literals, definition or ID, we access the synset object's properties directly:\n\n```python\nprint(\"Print its literals (synonym words): {}\".format(synset_object.literals))\nprint(\"Print its definition: {}\".format(synset_object.definition))\nprint(\"Print its ID: {}\".format(synset_object.id))\n```\n\n### Synsets access\n\nThe ``wn.synsets()`` function has two (optional) parameters, ``literal`` and ``pos``. If we specify a literal, it will return the IDs of all synsets that contain that literal. If we don't specify a literal, we will obtain the IDs of all existing synsets. The ``pos`` parameter filters by part of speech: NOUN, VERB, ADVERB or ADJECTIVE. 
The function returns a list of synset IDs.\n\n```python\nsynset_ids_all = wn.synsets() # get all synset IDs in RoWordNet\nsynset_ids_verbs = wn.synsets(pos=Synset.Pos.VERB) # get all verb synset IDs\nsynset_ids = wn.synsets(literal=\"cal\", pos=Synset.Pos.NOUN) # get all synset IDs that contain word \"cal\" and are nouns\n```\n\nFor example, to list all noun synsets containing the word \"cal\":\n\n```python\nword = 'cal'\nprint(\"Search for all noun synsets that contain word/literal '{}'\".format(word))\nsynset_ids = wn.synsets(literal=word, pos=Synset.Pos.NOUN)\nfor synset_id in synset_ids:\n    print(wn.synset(synset_id))\n```\nThis will output:\n```\nSearch for all noun synsets that contain word/literal 'cal'\nSynset(id='ENG30-03624767-n', literals=['cal'], definition='pies\u0103 la jocul de \u0219ah de forma unui cap de cal')\nSynset(id='ENG30-03538037-n', literals=['cal'], definition='Nume dat unor aparate sau piese asem\u0103n\u0103toare cu un cal :')\nSynset(id='ENG30-02376918-n', literals=['cal'], definition='Masculul speciei Equus caballus')\n```\n\n\n### Relations access\n\nSynsets are linked by relations (encoded as directed edges in a graph). A synset usually has outbound as well as inbound relations. To obtain the outbound relations of a synset, use ``wn.outbound_relations()`` with the synset ID as parameter. 
The result is a list of tuples like ``(synset_id, relation)`` encoding the target synset and the relation that starts from the current synset (given as parameter) to the target synset.\n\n```python\nsynset_id = wn.synsets(\"tren\")[2] # select the third synset from all synsets containing word \"tren\"\nprint(\"\\nPrint all outbound relations of {}\".format(wn.synset(synset_id)))\noutbound_relations = wn.outbound_relations(synset_id)\n# each entry is a (target_synset_id, relation) tuple\nfor target_synset_id, relation in outbound_relations:\n    print(\"\\tRelation [{}] to synset {}\".format(relation, wn.synset(target_synset_id)))\n```\nThis will output (amongst other relations):\n```\nPrint all outbound relations of Synset(id='ENG30-04468005-n', literals=['tren'], definition='Convoi de vagoane de cale ferat\u0103 legate \u00eentre \u0219i puse \u00een mi\u0219care de o locomotiv\u0103.')\n    Relation [hypernym] to synset Synset(id='ENG30-04019101-n', literals=['transport_public'], definition='transportarea pasagerilor sau postei')\n    Relation [hyponym] to synset Synset(id='ENG30-03394480-n', literals=['marfar', 'tren_de_marf\u0103'], definition='tren format din vagoane de marf\u0103')\n    Relation [member_meronym] to synset Synset(id='ENG30-03684823-n', literals=['locomotiv\u0103', 'ma\u0219in\u0103'], definition='Vehicul motor de cale ferat\u0103, cu surs\u0103 de energie proprie sau str\u0103in\u0103, folosind pentru a remorca \u0219i a deplasa vagoanele.')\n```\nThis means that from the current synset there are three relations pointing to other synsets: the first relation (hypernym) means that \"tren\" is-a \"transport\\_public\"; the second relation (hyponym) means that \"marfar\" is-a \"tren\"; the third relation (member\\_meronym) means that \"locomotiv\u0103\" is part-of \"tren\".\n\nThe ``wn.inbound_relations()`` function works identically but provides a list of _incoming_ relations to the synset given as parameter, while ``wn.relations()`` 
provides both inbound and outbound relations to/from a synset (note: ``wn.relations()`` is provided as a convenience and is mostly used for information/printing purposes, as the returned tuple list loses directionality).\n\n\n\n## Credits\n\nIf you decide to use this work in a scientific paper, please consider citing the following paper as a thank you to the authors of the actual Romanian WordNet data:\n\n```\nDan Tufi\u015f, Verginica Barbu Mititelu, The Lexical Ontology for Romanian, in Nuria Gala, Reinhard Rapp, Nuria Bel-Enguix (Ed.), Language Production, Cognition, and the Lexicon, series Text, Speech and Language Technology, vol. 48, Springer, 2014, p. 491-504.\n```\nor in .bib format:\n\n```\n@InBook{DTVBMzock,title = \"The Lexical Ontology for Romanian\",author = \"Tufi\u0219, Dan and Barbu Mititelu, Verginica\",booktitle = \"Language Production, Cognition, and the Lexicon\",editor = \"Nuria Gala and Reinhard Rapp and Nuria Bel-Enguix\",series = \"Text, Speech and Language Technology\",volume = \"48\",year = \"2014\",publisher = \"Springer\",pages = \"491-504\"}\n```\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/dumitrescustefan/RoWordNet", "keywords": "romanian wordnet rowordnet rown python", "license": "", "maintainer": "Dumitrescu Stefan and Andrei Marius Avram", "maintainer_email": "dumitrescu.stefan@gmail.com, avramandrei9666@gmail.com", "name": "rowordnet", "package_url": "https://pypi.org/project/rowordnet/", "platform": "", "project_url": "https://pypi.org/project/rowordnet/", "project_urls": { "Homepage": "https://github.com/dumitrescustefan/RoWordNet" }, "release_url": "https://pypi.org/project/rowordnet/0.9.3/", "requires_dist": [ "lxml", "networkx" ], "requires_python": "", "summary": "Python API for the Romanian WordNet", "version": "0.9.3" }, "last_serial": 4709500, "releases": { "0.9.3": [ { "comment_text": "", 
"digests": { "md5": "0c7947198ba5d7f87a21f297600dfb00", "sha256": "34c64b91f98f99b3c99ad0c818bc358244b14fbb573f3457be0eed0e5734cb35" }, "downloads": -1, "filename": "rowordnet-0.9.3-py3-none-any.whl", "has_sig": false, "md5_digest": "0c7947198ba5d7f87a21f297600dfb00", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11505765, "upload_time": "2018-09-11T09:17:04", "url": "https://files.pythonhosted.org/packages/bf/5c/a1af3a2f191f472cdbe8c9fb001e891c9eaf55166db07d6f789ad12c3def/rowordnet-0.9.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "163d315ad796559d040e67a87d6dcb4d", "sha256": "a1947ed255e8edbf416f14dc438b787e39c9f0480849acbfd7ac5692ddbd6d1d" }, "downloads": -1, "filename": "rowordnet-0.9.3.tar.gz", "has_sig": false, "md5_digest": "163d315ad796559d040e67a87d6dcb4d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11367424, "upload_time": "2018-09-11T09:17:09", "url": "https://files.pythonhosted.org/packages/44/11/424132ee8536efc451ccc7f6b587be9345dd1fec52c04fe46d25893bbb29/rowordnet-0.9.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "0c7947198ba5d7f87a21f297600dfb00", "sha256": "34c64b91f98f99b3c99ad0c818bc358244b14fbb573f3457be0eed0e5734cb35" }, "downloads": -1, "filename": "rowordnet-0.9.3-py3-none-any.whl", "has_sig": false, "md5_digest": "0c7947198ba5d7f87a21f297600dfb00", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11505765, "upload_time": "2018-09-11T09:17:04", "url": "https://files.pythonhosted.org/packages/bf/5c/a1af3a2f191f472cdbe8c9fb001e891c9eaf55166db07d6f789ad12c3def/rowordnet-0.9.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "163d315ad796559d040e67a87d6dcb4d", "sha256": "a1947ed255e8edbf416f14dc438b787e39c9f0480849acbfd7ac5692ddbd6d1d" }, "downloads": -1, "filename": "rowordnet-0.9.3.tar.gz", "has_sig": false, "md5_digest": "163d315ad796559d040e67a87d6dcb4d", 
"packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11367424, "upload_time": "2018-09-11T09:17:09", "url": "https://files.pythonhosted.org/packages/44/11/424132ee8536efc451ccc7f6b587be9345dd1fec52c04fe46d25893bbb29/rowordnet-0.9.3.tar.gz" } ] }