{ "info": { "author": "Konstantin Lopuhin", "author_email": "kostia.lopuhin@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "Labeled Russian Context for WSD\n===============================\n\nContexts sampled from RuTenTen and RNC. Sense definitions from Active Dictionary.\nSome words have two annotators. Number of contexts is 100 for most words\nand 500 for 7 words.\n\nAnnotators (words):\n\n- \u0410\u043d\u0430\u0441\u0442\u0430\u0441\u0438\u044f \u041b\u043e\u043f\u0443\u0445\u0438\u043d\u0430 (47)\n- \u041a\u043e\u043d\u0441\u0442\u0430\u043d\u0442\u0438\u043d \u041b\u043e\u043f\u0443\u0445\u0438\u043d (11)\n- \u0410\u043b\u0435\u043a\u0441\u0430\u043d\u0434\u0440\u0430 \u0423\u0434\u0430\u043b\u044c\u0446\u043e\u0432\u0430 (2)\n- \u0410\u043d\u0430\u0441\u0442\u0430\u0441\u0438\u044f \u041a. (2)\n- \u0410\u043d\u043d\u0430 \u041a\u043e\u0442 (2)\n- \u0410\u043d\u043d\u0430 \u0422\u0430\u0442\u0430\u0440\u0435\u043d\u043a\u043e (2)\n- \u0411\u043e\u0440\u0438\u0441 \u0418\u043e\u043c\u0434\u0438\u043d (2)\n- \u0418\u0432\u0430\u043d \u0421\u0430\u043c\u043e\u0439\u043b\u0435\u043d\u043a\u043e (1)\n\nContexts are stored in ``rl_wsd_labeled/``::\n\n rl_wsd_labeled\n \u251c\u2500\u2500 adjectives\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 RuTenTen\n \u251c\u2500\u2500 nouns\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 RNC\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 RuTenTen\n \u2514\u2500\u2500 verbs\n \u2514\u2500\u2500 RuTenTen\n\nA Python interface is provided. 
Install the package first::\n\n pip install rl_wsd_labeled\n\nand then, to get labeled contexts::\n\n >>> import rl_wsd_labeled\n >>> f = rl_wsd_labeled.contexts_filename('nouns', 'RuTenTen', '\u0433\u043e\u0440\u0448\u043e\u043a')\n >>> rl_wsd_labeled.get_contexts(f)\n\n ({'1': '\u041e\u043a\u0440\u0443\u0433\u043b\u044b\u0439 \u0433\u043b\u0438\u043d\u044f\u043d\u044b\u0439 \u0441\u043e\u0441\u0443\u0434 \u0434\u043b\u044f \u043f\u0440\u0438\u0433\u043e\u0442\u043e\u0432\u043b\u0435\u043d\u0438\u044f \u043f\u0438\u0449\u0438 (\u043f\u0435\u0447\u043d\u043e\u0439 \u0433\u043e\u0440\u0448\u043e\u043a)',\n '2': '\u0420\u0430\u0441\u0448\u0438\u0440\u044f\u044e\u0449\u0438\u0439\u0441\u044f \u043a\u0432\u0435\u0440\u0445\u0443 \u0441\u043e\u0441\u0443\u0434 \u0441 \u043e\u0442\u0432\u0435\u0440\u0441\u0442\u0438\u0435\u043c \u0432 \u0434\u043d\u0435 (\u0446\u0432\u0435\u0442\u043e\u0447\u043d\u044b\u0439 \u0433\u043e\u0440\u0448\u043e\u043a)',\n '3': '\u041d\u043e\u0447\u043d\u043e\u0439 \u0433\u043e\u0440\u0448\u043e\u043a'},\n [(('\u0442\u0435\u043b\u0435\u0432\u0438\u0437\u043e\u0440, - \u043a\u043e\u0432\u0435\u0440, , - \u043c\u0443\u0437\u044b\u043a\u0430\u043b\u044c\u043d\u044b\u0439 \u0446\u0435\u043d\u0442\u0440, - \u0441\u0442\u043e\u043b, - \u0430\u043a\u0432\u0430\u0440\u0438\u0443\u043c, - 3 \u0448\u043a\u0430\u0444\u0430, - \u0446\u0432\u0435\u0442\u044b \u0432',\n ' \u0433\u043e\u0440\u0448\u043a\u0430\u0445',\n ', - \u043c\u0435\u043b\u043a\u0438\u0435 \u0430\u043a\u0441\u0435\u0441\u0441\u0443\u0430\u0440\u044b.'),\n '2'),\n ...\n (('\u0438\u0431\u043e \u043d\u0430\u0441\u0442\u0430\u043d\u0435\u0442 \u0441\u0440\u043e\u043a \u0438 \u043e\u043d\u043e \u0431\u0443\u0434\u0435\u0442 \u0440\u0430\u0437\u0440\u0443\u0448\u0435\u043d\u043e \u0442\u0435\u0447\u0435\u043d\u0438\u0435\u043c \u0432\u0440\u0435\u043c\u0435\u043d\u0438 \u043b\u0438\u0431\u043e \u0432\u043e\u0439\u043d\u043e\u044e, \u0431\u0443\u0434\u0442\u043e 
\u0441\u0442\u0430\u0440\u044b\u0439',\n ' \u0433\u043e\u0440\u0448\u043e\u043a',\n ' \u0441 \u0432\u0438\u043d\u043e\u043c \u0432 \u0442\u0440\u044e\u043c\u0435 \u0442\u043e\u0440\u0433\u043e\u0432\u043e\u0433\u043e \u043a\u043e\u0440\u0430\u0431\u043b\u044f, \u043f\u043e\u043f\u0430\u0432\u0448\u0435\u0433\u043e \u0432 \u0431\u0443\u0440\u044e \u0438 \u0440\u0430\u0437\u0431\u0438\u0432\u0448\u0435\u0433\u043e\u0441\u044f \u043e \u0441\u043a\u0430\u043b\u044b.'),\n '1')\n ])\n\nApart from senses, there are two special annotations: \"0\" means\n\"I don't know/the context is unclear/the context is invalid\", and \"max sense + 1\"\nmeans \"other sense, not listed among given senses\". Contexts marked as \"0\" or \"other\"\nare not returned, unless ``with_skipped=True`` is passed.\nIf there was more than one annotator, contexts where annotators did not agree are also\nnot included. There is a function ``rl_wsd_labeled.get_agreement`` that returns the\nratio of senses where both annotators gave either the\nsame concrete sense, or both skipped the senses (so \"0\" and \"other\" are considered equal).\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/lopuhin/ruslang-wsd-labeled/", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "rl_wsd_labeled", "package_url": "https://pypi.org/project/rl_wsd_labeled/", "platform": "", "project_url": "https://pypi.org/project/rl_wsd_labeled/", "project_urls": { "Homepage": "https://github.com/lopuhin/ruslang-wsd-labeled/" }, "release_url": "https://pypi.org/project/rl_wsd_labeled/0.1.1/", "requires_dist": null, "requires_python": "", "summary": "Labeled contexts of Russian polysemous words", "version": "0.1.1" }, "last_serial": 3129722, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "67c4dc67c5c11e6218676bbfba8a8870", "sha256": 
"1f641434356858df4ed90fa9a96fe27a6dfd91087c667c30ba3381b062b319cd" }, "downloads": -1, "filename": "rl_wsd_labeled-0.0.2.tar.gz", "has_sig": false, "md5_digest": "67c4dc67c5c11e6218676bbfba8a8870", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 979787, "upload_time": "2017-04-19T18:03:16", "url": "https://files.pythonhosted.org/packages/d2/e7/e1d86e9ac90ecba98fd9f1eb8a53c9f4084983378d346242e40326d7d1f5/rl_wsd_labeled-0.0.2.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "e030a37b5d1253792c76e65fc7f9f71a", "sha256": "69682c678ba77b96d22ffcab5785fef03db46676332b75cabc751efdcd420164" }, "downloads": -1, "filename": "rl_wsd_labeled-0.1.0.tar.gz", "has_sig": false, "md5_digest": "e030a37b5d1253792c76e65fc7f9f71a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1255936, "upload_time": "2017-08-28T18:29:30", "url": "https://files.pythonhosted.org/packages/7e/65/242229a2b5251cfa1c814e1bf0ea4c6bf060fe687ae4e66936754d7eea59/rl_wsd_labeled-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "9faffd21350a5e78f586e139aadb3df4", "sha256": "9b1ad7e66e8d3f8cf249220b9284cdc31e235d2e270d1aa152cf218b558ee482" }, "downloads": -1, "filename": "rl_wsd_labeled-0.1.1.tar.gz", "has_sig": false, "md5_digest": "9faffd21350a5e78f586e139aadb3df4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1257693, "upload_time": "2017-08-28T18:40:03", "url": "https://files.pythonhosted.org/packages/c1/9d/8637fb2ca2df57cdc2eb344d9db3d59bdfe224c63a05889fde2f99eb86df/rl_wsd_labeled-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9faffd21350a5e78f586e139aadb3df4", "sha256": "9b1ad7e66e8d3f8cf249220b9284cdc31e235d2e270d1aa152cf218b558ee482" }, "downloads": -1, "filename": "rl_wsd_labeled-0.1.1.tar.gz", "has_sig": false, "md5_digest": "9faffd21350a5e78f586e139aadb3df4", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 1257693, "upload_time": "2017-08-28T18:40:03", "url": "https://files.pythonhosted.org/packages/c1/9d/8637fb2ca2df57cdc2eb344d9db3d59bdfe224c63a05889fde2f99eb86df/rl_wsd_labeled-0.1.1.tar.gz" } ] }