{ "info": { "author": "Will Roberts", "author_email": "wildwilhelm@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: German", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Topic :: Scientific/Engineering", "Topic :: Text Processing :: Linguistic" ], "description": "============\n pygermanet\n============\n\n.. image:: https://img.shields.io/pypi/v/pygermanet.svg\n :target: https://pypi.python.org/pypi/pygermanet/\n :alt: Latest Version\n\nGermaNet API for Python.\n\nCopyright (c) 23 March, 2014 Will Roberts .\n\nGermaNet_ is the German WordNet_, a machine-readable lexical semantic\nresource which lists nouns, verbs and adjectives in German, along with\nlexical relations which connect these words together into a semantic\nnetwork. This library gives a Python interface to this resource.\n\n.. _GermaNet: http://www.sfs.uni-tuebingen.de/GermaNet/\n.. _WordNet: http://wordnet.princeton.edu/\n\n\nIntroduction\n------------\n\nStart GermaNet by connecting to the MongoDB_ database which contains\nthe lexical information (for setting up the MongoDB database, see the\nsection `Setup`_, below). On the local machine using the default\nport, this is as simple as::\n\n >>> from pygermanet import load_germanet\n >>> gn = load_germanet()\n\nYou can search GermaNet for synsets containing a particular lemmatised\nword form using the ``synsets`` function::\n\n >>> gn.synsets('gehen')\n [Synset(auseinandergehen.v.3),\n Synset(funktionieren.v.1),\n Synset(funktionieren.v.2),\n Synset(gehen.v.1),\n Synset(gehen.v.4),\n Synset(gehen.v.5),\n Synset(gehen.v.6),\n Synset(gehen.v.7),\n Synset(gehen.v.9),\n Synset(gehen.v.10),\n Synset(gehen.v.11),\n Synset(gehen.v.12),\n Synset(gehen.v.13),\n Synset(gehen.v.14),\n Synset(handeln.v.1)]\n\nTo look up synsets, you must have the canonical form of the word as it\nwould appear in a dictionary (head word); this module calls this word\nform the *lemma*. The GermaNet instance can perform lemmatisation of\nwords using data derived from the `Projekt deutscher Wortschatz`_::\n\n >>> gn.lemmatise(u'ginge')\n [u'gehen']\n\n.. _Projekt deutscher Wortschatz: http://wortschatz.uni-leipzig.de/\n\nEach ``Synset`` is represented by the orthographic form, part of speech,\nand sense number of its first ``Lemma``; this acts as a unique\nidentifier for synsets. If you know this identifier, you can also\nlook up a synset in GermaNet::\n\n >>> funktionieren = gn.synset(u'funktionieren.v.2')\n >>> funktionieren\n Synset(funktionieren.v.2)\n\n``Synset`` objects have data members and methods::\n\n >>> funktionieren.hyponyms\n [Synset(vorgehen.v.1), Synset(leerlaufen.v.2)]\n >>> gn.synset('Husky.n.1').hypernym_paths\n [[Synset(GNROOT.n.1),\n Synset(Entit\u00e4t.n.2),\n Synset(Objekt.n.4),\n Synset(Ding.n.2),\n Synset(Teil.n.2),\n Synset(Teilmenge.n.2),\n Synset(Gruppe.n.1),\n Synset(biologische Gruppe.n.1),\n Synset(Spezies.n.1),\n Synset(Rasse.n.1),\n Synset(Tierrasse.n.1),\n Synset(Hunderasse.n.1),\n Synset(Husky.n.1)],\n [Synset(GNROOT.n.1),\n Synset(Entit\u00e4t.n.2),\n Synset(kognitives Objekt.n.1),\n Synset(Kategorie.n.1),\n Synset(Art.n.1),\n Synset(Spezies.n.1),\n Synset(Rasse.n.1),\n Synset(Tierrasse.n.1),\n Synset(Hunderasse.n.1),\n Synset(Husky.n.1)],\n [Synset(GNROOT.n.1),\n Synset(Entit\u00e4t.n.2),\n Synset(Objekt.n.4),\n Synset(nat\u00fcrliches Objekt.n.1),\n Synset(Wesenheit.n.1),\n Synset(Organismus.n.1),\n Synset(h\u00f6heres Lebewesen.n.1),\n Synset(Tier.n.1),\n Synset(Gewebetier.n.1),\n Synset(Chordatier.n.1),\n Synset(Wirbeltier.n.1),\n Synset(S\u00e4ugetier.n.1),\n Synset(Plazentatier.n.1),\n Synset(Beutegreifer.n.1),\n Synset(Landraubtier.n.1),\n Synset(hundeartiges Landraubtier.n.1),\n Synset(Hund.n.2),\n Synset(Husky.n.1)],\n [Synset(GNROOT.n.1),\n Synset(Entit\u00e4t.n.2),\n Synset(Objekt.n.4),\n Synset(nat\u00fcrliches Objekt.n.1),\n Synset(Wesenheit.n.1),\n Synset(Organismus.n.1),\n Synset(h\u00f6heres Lebewesen.n.1),\n Synset(Tier.n.1),\n Synset(Haustier.n.1),\n Synset(Hund.n.2),\n Synset(Husky.n.1)]]\n\nEach ``Synset`` contains one or more ``Lemma`` objects::\n\n >>> funktionieren.lemmas\n [Lemma(funktionieren.v.2.funktionieren),\n Lemma(funktionieren.v.2.funzen),\n Lemma(funktionieren.v.2.gehen),\n Lemma(funktionieren.v.2.laufen),\n Lemma(funktionieren.v.2.arbeiten)]\n\nA given orthographic form may be represented by multiple ``Lemma``\nobjects belonging to different ``Synset`` objects::\n\n >>> gn.lemmas('brennen')\n [Lemma(brennen.v.1.brennen),\n Lemma(verbrennen.v.1.brennen),\n Lemma(brennen.v.3.brennen),\n Lemma(brennen.v.4.brennen),\n Lemma(brennen.v.5.brennen),\n Lemma(destillieren.v.1.brennen),\n Lemma(brennen.v.7.brennen),\n Lemma(brennen.v.8.brennen)]\n\nSemantic Similarity\n-------------------\n\npygermanet includes several functions for calculating semantic\nsimilarity and semantic distance, somewhat like `WN::Similarity`_.\nThese metrics use word frequency information estimated on the SdeWaC_\ncorpus and then automatically lemmatised using the TreeTagger_.\n\nThe probability of encountering an instance of a given synset *s* is\nestimated over SdeWaC using the procedure described by `Resnik (1995)`_.\nBriefly, for each instance of a noun in the corpus, we find the set of\nsynsets *S* containing a sense of that noun; each of these synsets is then\ncredited with a count of 1/*|S|*. A count added to a synset is\nalso added to all of its hypernyms, so that count observations are\npropagated up the taxonomy. By dividing by the total number of noun\ninstances in the corpus, each synset is assigned a probability value;\nthese probabilities increase monotonically up the taxonomy, and the\nroot node has *p = 1*.\n\nUsing\nthis interface, we can replicate the results of `(Gurevych, 2005)`_\nand `(Gurevych and Niederlich, 2005)`_, who collected human semantic\nsimilarity judgements on 65 word pairs and then measured the\ncorrelation of these judgements against similarity scores reported by\nvarious automatic similarity metrics. These two papers reported\nPearson's *r* of 0.715 for (Resnik, 1995), 0.738 for a normalised\ndistance version of `(Jiang and Conrath, 1997)`_, and 0.734 for\n`(Lin, 1998)`_, with inter-annotator agreement of 0.810.\n\nReplication of the two studies, using the gur65_ data set::\n\n from pygermanet import load_germanet, Synset\n from scipy.stats.stats import pearsonr\n import codecs\n import numpy as np\n\n GUR65_FILENAME = 'gur65.csv'\n\n def load_gurevych():\n gur65 = []\n with codecs.open(GUR65_FILENAME, 'r', 'latin-1') as input_file:\n for idx, line in enumerate(input_file):\n fields = line.strip().split(';')\n if idx == 0:\n header = fields\n else:\n # fix typo in gur65\n fields[1] = {'Reis': 'Reise'}.get(fields[1], fields[1])\n fields[2] = float(fields[2])\n fields[3] = float(fields[3])\n gur65.append(fields)\n gur65 = np.core.records.array(\n gur65,\n dtype=np.dtype({'formats': ['U30', 'U30', '>> from pygermanet import load_germanet\n >>> gn = load_germanet()\n\nLicense\n-------\n\nThis README file and the source code in this library are licensed\nunder the MIT License (see source file LICENSE.txt for details).\n\nThe file ``baseforms_by_projekt_deutscher_wortschatz.txt.gz`` contains\ndata derived from the `Projekt deutscher Wortschatz`_; this database\nis free for educational and researching purposes but not for\ncommercial use. For more information visit:\nhttp://wortschatz.uni-leipzig.de/.\n\nThe file ``sdewac-gn-words.tsv.gz`` contains data derived from the\nSdeWaC_ corpus, whose license and copyright status are\nslightly unclear to me. I'm releasing the file under the\n`Creative Commons Attribution 4.0 International License`_.\n\n.. _`Creative Commons Attribution 4.0 International License`: http://creativecommons.org/licenses/by/4.0/\n\nHistory\n-------\n\nThe NLTK_ project had an API once upon a time for interacting with\nGermaNet, but this has now been removed from the current NLTK\ndistribution. This API was called GermaNLTK_ and was described in\nsome detail in `NLTK Issue 604`_. pygermanet shamelessly imitates the\ninterface of this older NLTK code, which was, in turn, based on the\nstandard NLTK interface to WordNet.\n\n.. _NLTK: http://www.nltk.org/\n.. _GermaNLTK: https://docs.google.com/document/d/1rdn0hOnJNcOBWEZgipdDfSyjJdnv_sinuAUSDSpiQns/edit?hl=en\n.. _NLTK Issue 604: https://code.google.com/p/nltk/issues/detail?id=604\n\nThe GermaNLTK project had a script to push the content of the XML\nfiles into a sqlite database; I haven't tested this code myself, and\nthe GermaNet database has changed over the years since GermaNLTK was\nwritten. This ``mongo_import.py`` script included in this library does much the\nsame thing, and could easily be adapted to use sqlite as a back end\ninstead of MongoDB.\n\n\nContributors\n------------\n\nThanks to stefanpernes_ for his work on making a Python 3 port of\npygermanet.\n\n.. _stefanpernes: https://github.com/stefanpernes", "description_content_type": null, "docs_url": null, "download_url": null, "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/wroberts/pygermanet", "keywords": "germanet nlp wordnet", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "pygermanet", "package_url": "https://pypi.org/project/pygermanet/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/pygermanet/", "project_urls": { "Homepage": "https://github.com/wroberts/pygermanet" }, "release_url": "https://pypi.org/project/pygermanet/1.0.2/", "requires_dist": [ "pymongo", "future (>=0.14)" ], "requires_python": null, "summary": "GermaNet API for Python", "version": "1.0.2" }, "last_serial": 2453264, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "9aabefecc996409edd6ffeb32574b0f1", "sha256": "7483da29b93611bbfbc648709dbe28fa06ec1048f2f2e4dcf68c6a52f679d524" }, "downloads": -1, "filename": "pygermanet-1.0.0-py2-none-any.whl", "has_sig": false, "md5_digest": "9aabefecc996409edd6ffeb32574b0f1", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 8595948, "upload_time": "2015-02-17T15:39:32", "url": "https://files.pythonhosted.org/packages/5b/6b/83cfaac6a7917a221bb6ad63ce789d3637c909c0c421a6765580252ff07d/pygermanet-1.0.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "fc3afee25f606c66b2c0e3ea80626c16", "sha256": "e96d044233148ce1918bde83f99b5be927ef1bd90e049ba87206cded2e464912" }, "downloads": -1, "filename": "pygermanet-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "fc3afee25f606c66b2c0e3ea80626c16", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8595949, "upload_time": "2015-02-17T15:41:55", "url": "https://files.pythonhosted.org/packages/85/32/caa0bf49dc84396b308435260aa5c1790a6fe784800eac91dc0e3b9ecdb0/pygermanet-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c113913bfec72e63c8c835d4dfe49683", "sha256": "9c8f4ead18f954339cf1f0ee84c5a901ae24b3bc6075fec71baa1ef73343f9d0" }, "downloads": -1, "filename": "pygermanet-1.0.0.tar.gz", "has_sig": false, "md5_digest": "c113913bfec72e63c8c835d4dfe49683", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8600558, "upload_time": "2015-02-17T15:44:27", "url": "https://files.pythonhosted.org/packages/e4/bb/e0cc0ba5e5abd62dcb3946ffd05607f5bc6921348382e087391e61960c38/pygermanet-1.0.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "ac1d51ed2e4fcf657ec446da6a756dc9", "sha256": "2bb3d8b9358f77fbd39b76f90c5f3bd90a9b2823c0198fded50c1d27c2c8b9b0" }, "downloads": -1, "filename": "pygermanet-1.0.1-py2-none-any.whl", "has_sig": false, "md5_digest": "ac1d51ed2e4fcf657ec446da6a756dc9", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 8596008, "upload_time": "2015-02-17T17:12:07", "url": "https://files.pythonhosted.org/packages/ae/33/2652216f1bee3f1376df71500c817948ef0ca2436e6187f645ceda15ee2e/pygermanet-1.0.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f75e781d7e368f9909e1da7893c16fcf", "sha256": "4ac5d4d0a522100a34d76d9c35073ddc9273112f63769a0514d7b24b4e5abf9f" }, "downloads": -1, "filename": "pygermanet-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "f75e781d7e368f9909e1da7893c16fcf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8595981, "upload_time": "2015-02-17T17:14:30", "url": "https://files.pythonhosted.org/packages/42/f2/172aafb5426993a42cbf380133372e639706b617abc17e6eacca07cc1c20/pygermanet-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cf5e4c187c0cf6718455da14fd1aeee7", "sha256": "c07adbe66e908b4f4716489368425899e6c2e494b7c7cdf6b20d0c9fd6559fe9" }, "downloads": -1, "filename": "pygermanet-1.0.1.tar.gz", "has_sig": false, "md5_digest": "cf5e4c187c0cf6718455da14fd1aeee7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8600683, "upload_time": "2015-02-17T17:16:53", "url": "https://files.pythonhosted.org/packages/27/89/1a8cb6b7d5dc310c22d01263c0fe57fa85329c28cd238bd1eab9308a2071/pygermanet-1.0.1.tar.gz" } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "93dca681061bb1211d5f7e1b812db1ba", "sha256": "68d4c3e5e8da1b675e2dacb5f6b63f58d311e8aea8fcdcf90dcee595d3e52e1c" }, "downloads": -1, "filename": "pygermanet-1.0.2-py2-none-any.whl", "has_sig": false, "md5_digest": "93dca681061bb1211d5f7e1b812db1ba", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 8596875, "upload_time": "2016-11-10T15:49:41", "url": "https://files.pythonhosted.org/packages/aa/b7/56427fae5b1ba5c9fa31f8b1e93155074edaf62df3d3f0b8f08bd2a7f691/pygermanet-1.0.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a2a91d21d8cb504255f5a393eb3489cf", "sha256": "2dab9d20177f7111fdb3415d72212293b54380972a870cc95640ace24f385e03" }, "downloads": -1, "filename": "pygermanet-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "a2a91d21d8cb504255f5a393eb3489cf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8596855, "upload_time": "2016-11-10T15:49:56", "url": "https://files.pythonhosted.org/packages/1f/c1/e5e2f4195dfb7cf0228e2ad706ce201931ab427ac734e1c68841d76e9b4c/pygermanet-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "44d1ce111b49a42b989950904424ef9a", "sha256": "dd47b2512c07c94e7444da1443026f782d44ed74fab86d87b1c3646d97cb7d45" }, "downloads": -1, "filename": "pygermanet-1.0.2.tar.gz", "has_sig": false, "md5_digest": "44d1ce111b49a42b989950904424ef9a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8601347, "upload_time": "2016-11-10T15:50:11", "url": "https://files.pythonhosted.org/packages/c6/60/15a69272514d8b8f7239225c4b80b219753bf7be76df99283e2d30a1354a/pygermanet-1.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "93dca681061bb1211d5f7e1b812db1ba", "sha256": "68d4c3e5e8da1b675e2dacb5f6b63f58d311e8aea8fcdcf90dcee595d3e52e1c" }, "downloads": -1, "filename": "pygermanet-1.0.2-py2-none-any.whl", "has_sig": false, "md5_digest": "93dca681061bb1211d5f7e1b812db1ba", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 8596875, "upload_time": "2016-11-10T15:49:41", "url": "https://files.pythonhosted.org/packages/aa/b7/56427fae5b1ba5c9fa31f8b1e93155074edaf62df3d3f0b8f08bd2a7f691/pygermanet-1.0.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a2a91d21d8cb504255f5a393eb3489cf", "sha256": "2dab9d20177f7111fdb3415d72212293b54380972a870cc95640ace24f385e03" }, "downloads": -1, "filename": "pygermanet-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "a2a91d21d8cb504255f5a393eb3489cf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8596855, "upload_time": "2016-11-10T15:49:56", "url": "https://files.pythonhosted.org/packages/1f/c1/e5e2f4195dfb7cf0228e2ad706ce201931ab427ac734e1c68841d76e9b4c/pygermanet-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "44d1ce111b49a42b989950904424ef9a", "sha256": "dd47b2512c07c94e7444da1443026f782d44ed74fab86d87b1c3646d97cb7d45" }, "downloads": -1, "filename": "pygermanet-1.0.2.tar.gz", "has_sig": false, "md5_digest": "44d1ce111b49a42b989950904424ef9a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8601347, "upload_time": "2016-11-10T15:50:11", "url": "https://files.pythonhosted.org/packages/c6/60/15a69272514d8b8f7239225c4b80b219753bf7be76df99283e2d30a1354a/pygermanet-1.0.2.tar.gz" } ] }