{ "info": { "author": "Syntpump", "author_email": "lynnporu@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License" ], "description": "[![Pypi](https://img.shields.io/pypi/v/dclua.svg)](https://pypi.python.org/pypi/dclua)\n\n# Declensor library\n\nWith the `dclua.py` library you can train declension models and decline words. It simply replaces the suffix of a word so that it matches the new morphological properties you want the word to have. Here are some topics that will help you understand how it works.\n\n## Morphological vector\nA morphological vector is a vector that determines the morphological properties of a lexeme. The number at each coordinate determines one property. You can use your own vectors for your language, but here is the structure suggested for Ukrainian.\n\n### Noun vectors\nNoun vectors have 2 coordinates: `[number][case]`. 
Here is what each value means.\n\n| Coordinate | Number | Case |\n|------------|----------|---------------|\n| 0 | | Nominative |\n| 1 | Singular | Genitive |\n| 2 | Plural | Dative |\n| 3 | | Accusative |\n| 4 | | Instrumental |\n| 5 | | Locative |\n| 6 | | Vocative |\n\nThe infinitive suffix is placed at `[0][0]`.\n\n### Verb vectors\nVerb vectors have 4 coordinates: `[tense][person][number][gender]`.\n\n| Coordinate | Tense | Person | Number | Gender |\n|------------|----------|---------------|----------|-----------|\n| 0 | | First | Singular | Masculine |\n| 1 | Present | Second | Plural | Feminine |\n| 2 | Future | Third | | Neuter |\n| 3 | Past | | | |\n\nThe infinitive suffix is placed at `[0][0][0][0]`.\n\n### Adjective vectors\nAdjective vectors have 3 coordinates: `[gender][number][person]`.\n\n| Coordinate | Gender | Number | Person |\n|------------|-----------|----------|---------------|\n| 0 | | Singular | First |\n| 1 | Masculine | Plural | Second |\n| 2 | Feminine | | Third |\n| 3 | Neuter | | |\n\nThe infinitive suffix is placed at `[0][0][0]`.\n\n## Declension rule\nA declension rule is a multidimensional array of declined suffixes, indexed by [morphology vectors](#morphological-vector). 
You can create such a rule for your word like this:\n\n```python\nrule = dclua.DeclenseTrainer.analyze({\n    (0,0): '\u0443\u0441\u043c\u0456\u0448\u043a\u0430',\n    (1,0): '\u0443\u0441\u043c\u0456\u0448\u043a\u0430',\n    (1,1): '\u0443\u0441\u043c\u0456\u0448\u043a\u0438',\n    (1,2): '\u0443\u0441\u043c\u0456\u0448\u0446\u0456',\n    #...\n    (2,6): '\u0443\u0441\u043c\u0456\u0448\u043a\u0438'\n})\n```\n\nNow the rule will look like this:\n```python\nrule[0][0] == '\u043a\u0430'\nrule[1][0] == '\u043a\u0430'\nrule[1][1] == '\u043a\u0438'\nrule[1][2] == '\u0446\u0456'\n#...\nrule[2][6] == '\u043a\u0438'\n```\n\nEvery word has its own suffix, so you need to create a rule for each of them before you can use it.\n\nThe `analyze` method also accepts a `minsize` argument, which determines the minimal size of the produced suffix.\n\n## Word declension\nOnce you have a model (a bundle of rules) for different suffixes, you can use it to decline words. The syntax is as follows:\n```\nDeclensor.declense(str word, tuple newmporph, tuple morphology=None)\n```\nSuppose you have a `model` variable that contains models for all the suffixes you need. Then you can decline words in the following way:\n```python\n>>> dcl = dclua.Declensor(model)\n>>> dcl.declense('\u0441\u043e\u043d\u0446\u044e', (1,1))\n<<< '\u0441\u043e\u043d\u0446\u044f'\n```\nThe morphology vector of the given word is recognized automatically, so it may take some time to find an appropriate declension model in `models`. If you already know the morphology of the word you want to decline, pass it as the `morphology` argument:\n```python\n>>> dcl = dclua.Declensor(model)\n>>> dcl.declense('\u0441\u043e\u043d\u0446\u044e', (1,1), morphology=(1,2))\n<<< '\u0441\u043e\u043d\u0446\u044f'\n```\n\n## Train your model\nTo train your model, you can use the template in `template.py` in this directory.\n\n### Generalizing model\nSometimes a suffix in a model can appear in slight variations. 
For example, in `aab` and `aac` only the last letter differs. You can define groups of letters that may differ in such cases and generalize your model according to these groups. Example usage:\n```python\n>>> dclua.DeclenseTrainer.generalizeModel(\n... model = [\n... [[\"\u043e\u043d\u0430\"], [\"\u043e\u043d\u0438\"], [\"\u043e\u043d\u0456\u0432\"]],\n... [[\"\u043e\u0432\u0430\"], [\"\u043e\u0432\u0438\"], [\"\u043e\u0432\u0456\u0432\"]],\n... ],\n... groups = [\n... [\"\u043d\", \"\u0432\", \"\u043f\", \"\u043c\"]\n... ],\n... threshold=.3\n... )\n...\n<<< [\n... [\n... [['\u043e\u043d\u0430'], ['\u043e\u043d\u0438'], ['\u043e\u043d\u0456\u0432']],\n... [['\u043e\u0432\u0430'], ['\u043e\u0432\u0438'], ['\u043e\u0432\u0456\u0432']],\n... [['\u043e\u043f\u0430'], ['\u043e\u043f\u0438'], ['\u043e\u043f\u0456\u0432']],\n... [['\u043e\u043c\u0430'], ['\u043e\u043c\u0438'], ['\u043e\u043c\u0456\u0432']]\n... ]\n... ]\n```\n\nThe threshold parameter is the ratio between the number of rules that can be generalized to a group and the size of that group. 
It defaults to `.3`, so if there are fewer than `.3 * size_of_group` rules, they won't be generalized.\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/syntpump/declensor", "keywords": "nlp", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "dclua", "package_url": "https://pypi.org/project/dclua/", "platform": "", "project_url": "https://pypi.org/project/dclua/", "project_urls": { "Homepage": "https://github.com/syntpump/declensor", "Syntpump on GitHub": "https://github.com/syntpump" }, "release_url": "https://pypi.org/project/dclua/2.0/", "requires_dist": null, "requires_python": ">3", "summary": "Library for word declensions", "version": "2.0" }, "last_serial": 5312922, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "496f5c222c86edd51e8aa93c40708822", "sha256": "ef32116c7781fe273e177afdcbe890b0c48a95af354a14c879f11a1947ac66f0" }, "downloads": -1, "filename": "dclua-1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "496f5c222c86edd51e8aa93c40708822", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">3", "size": 3327, "upload_time": "2019-05-18T15:54:31", "url": "https://files.pythonhosted.org/packages/fc/3e/fcc68782a53ac58634e40a8a04cd004916d0f39fe0807673b88c0b2d5ce1/dclua-1.0-py2.py3-none-any.whl" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "740091118e2df2cc1ea4c65b44ea4b5e", "sha256": "85ad16aa364b70acf3e309e038785206be202c795d5e0774618e1661d3b14713" }, "downloads": -1, "filename": "dclua-1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "740091118e2df2cc1ea4c65b44ea4b5e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">3", "size": 6514, "upload_time": "2019-05-19T06:58:43", "url": 
"https://files.pythonhosted.org/packages/ab/aa/79d1f3b1462873afcf67e61d7c7d515fea8e85763b126610b45ac19a0fdf/dclua-1.1-py2.py3-none-any.whl" } ], "2.0": [ { "comment_text": "", "digests": { "md5": "13575a5af68cf1605665eeab67023612", "sha256": "595c022835a40e3e8461941b359d284634e5d3d524f5550e2a136337785518d8" }, "downloads": -1, "filename": "dclua-2.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "13575a5af68cf1605665eeab67023612", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">3", "size": 8821, "upload_time": "2019-05-24T14:04:40", "url": "https://files.pythonhosted.org/packages/76/fb/a82d5f3132c736b69d44c5d20d5713a2a1375f0099da673998c98f31d535/dclua-2.0-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "13575a5af68cf1605665eeab67023612", "sha256": "595c022835a40e3e8461941b359d284634e5d3d524f5550e2a136337785518d8" }, "downloads": -1, "filename": "dclua-2.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "13575a5af68cf1605665eeab67023612", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">3", "size": 8821, "upload_time": "2019-05-24T14:04:40", "url": "https://files.pythonhosted.org/packages/76/fb/a82d5f3132c736b69d44c5d20d5713a2a1375f0099da673998c98f31d535/dclua-2.0-py2.py3-none-any.whl" } ] }