{ "info": { "author": "Aleksas Pielikis", "author_email": "ant.kampo@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4" ], "description": "[![Build status](https://ci.appveyor.com/api/projects/status/pd61vbwpawr3yejs?svg=true)](https://ci.appveyor.com/project/aleksas/phonology-engine)\n[![PyPI](https://img.shields.io/pypi/v/phonology_engine.svg)](https://pypi.org/project/phonology-engine)\n\n# About\n\nAt the core of this library is the text normalization and word stressing processor from the [LIEPA speech synthesizer](https://www.ra\u0161tija.lt/liepa). The native code related to text processing was extracted from [the synthesizer library code](https://www.ra\u0161tija.lt/liepa/infrastrukturines-paslaugos/elektroninio-teksto-skaitytuvas/7563) and wrapped in Python.\n\n# License\n\n- [BSD license](https://raw.githubusercontent.com/aleksas/phonology_engine/master/LICENSE)\n\n# Intro\n\nThe library takes text in Lithuanian and does the following:\n- Normalizes it, converting numbers to word representations (e.g. 
\"1\" > \"vienas\").\n- Splits text into phrases/sentences.\n- Splits phrases into words.\n- Identifies word syllables.\n- Identifies possible grammar forms of each word, and the stressed letter and stress type according to each grammar form.\n- Chooses one rule.\n- Returns either structured or collapsed results.\n\nThe library supports the following environments:\n- Python: 2.7, 3.*\n- OS: Linux, Windows\n- Architecture: 32bit, 64bit\n\n# Installing\n\n```\npip install phonology_engine\n```\n\n# Using\n\n## Normalize text\nConverting numbers to their word representations.\n\n```\nfrom phonology_engine import PhonologyEngine\npe = PhonologyEngine()\nres = pe.normalize_and_collapse('31 ka\u010diukas perb\u0117go keli\u0105.')\nprint(res)\n```\nThis would result in:\n```\nTRISDE\u0160IMT VIENAS KA\u010cIUKAS PERB\u0116GO KELI\u0104.\n```\n\n## Process\nDetermining word stresses.\n\n```\nfrom phonology_engine import PhonologyEngine\npe = PhonologyEngine()\nres = pe.process_and_collapse('31 ka\u010diukas perb\u0117go keli\u0105.', 'utf8_stressed_word')\nprint(res)\n```\nThis would result in:\n```\nTRI\u0300SDE\u0160IMT VI\u0301ENAS KA\u010cIU\u0300KAS PE\u0301RB\u0116GO KE\u0303LI\u0104.\n```\n------\n\nDetermining word stresses, syllables, and grammar forms.\n\n```\nfrom phonology_engine import PhonologyEngine\nfrom pprint import pprint\npe = PhonologyEngine()\nres = pe.process('31 ka\u010diukas perb\u0117go keli\u0105.', include_syllables=True)\npprint(res)\n```\nThis would result in:\n```\n('.',\n [('',\n [[{'ascii_stressed_word': 'TRI`-SDE-\u0160IMT',\n 'number_stressed_word': 'TRI0-SDE-\u0160IMT',\n 'stress_options': {'decoded_options': [{'rule': 'Nekaitomas \u017eodis'}],\n 'options': [(2, 0, 1, 1688)],\n 'selected_index': 0},\n 'syllables': [0, 3, 6],\n 'utf8_stressed_word': 'TRI\u0300-SDE-\u0160IMT',\n 'word': 'TRI-SDE-\u0160IMT'},\n {'ascii_stressed_word': 'VI^E-NAS',\n 'number_stressed_word': 'VI1E-NAS',\n 'stress_options': {'decoded_options': [{'grammatical_case': 
'Vardininkas',\n 'number': 'vienaskaita',\n 'rule': 'Linksnis ir kamieno '\n 'tipas',\n 'stem_type': 16,\n 'stress_type': 1,\n 'stressed_letter_index': 1}],\n 'options': [(1, 1, 2, 4096)],\n 'selected_index': 0},\n 'syllables': [0, 3],\n 'utf8_stressed_word': 'VI\u0301E-NAS',\n 'word': 'VIE-NAS'},\n {'ascii_stressed_word': 'KA-\u010cIU`-KAS',\n 'number_stressed_word': 'KA-\u010cIU0-KAS',\n 'stress_options': {'decoded_options': [{'grammatical_case': 'Vardininkas',\n 'number': 'vienaskaita',\n 'rule': 'Linksnis ir kamieno '\n 'tipas',\n 'stem_type': 0,\n 'stress_type': 0,\n 'stressed_letter_index': 4}],\n 'options': [(4, 0, 2, 0)],\n 'selected_index': 0},\n 'syllables': [0, 2, 5],\n 'utf8_stressed_word': 'KA-\u010cIU\u0300-KAS',\n 'word': 'KA-\u010cIU-KAS'},\n {'ascii_stressed_word': 'PE^R-B\u0116-GO',\n 'number_stressed_word': 'PE1R-B\u0116-GO',\n 'stress_options': {'decoded_options': [{'rule': 'Veiksmazod\u017ei\u0173 kamienas '\n 'ir galune (taisytina)'}],\n 'options': [(1, 1, 0, 465)],\n 'selected_index': 0},\n 'syllables': [0, 3, 5],\n 'utf8_stressed_word': 'PE\u0301R-B\u0116-GO',\n 'word': 'PER-B\u0116-GO'},\n {'ascii_stressed_word': 'KE~-LI\u0104',\n 'number_stressed_word': 'KE2-LI\u0104',\n 'stress_options': {'decoded_options': [{'grammatical_case': 'Galininkas',\n 'number': 'vienaskaita',\n 'rule': 'Linksnis ir kamieno '\n 'tipas',\n 'stem_type': 2,\n 'stress_type': 2,\n 'stressed_letter_index': 1}],\n 'options': [(1, 2, 2, 515)],\n 'selected_index': 0},\n 'syllables': [0, 2],\n 'utf8_stressed_word': 'KE\u0303-LI\u0104',\n 'word': 'KE-LI\u0104'}]],\n ['TRISDE\u0160IMT VIENAS KA\u010cIUKAS PERB\u0116GO KELI\u0104']),\n ''])\n```\n\n# References\n- [Kir\u010diavimas internetu](http://kirtis.info) - An online dictionary with word stresses and grammar annotations. It has a [GitHub repo](https://github.com/Sistemium/krc-angular) and is likely based on the [VDU dictionary](https://github.com/aleksas/phonology_engine/tree/resources/VDU). 
\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/aleksas/phonology_engine", "keywords": "phonology_engine,phonology,pronunciation,stress,syllable,accent,hyphenation", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "phonology-engine", "package_url": "https://pypi.org/project/phonology-engine/", "platform": "", "project_url": "https://pypi.org/project/phonology-engine/", "project_urls": { "Homepage": "https://github.com/aleksas/phonology_engine" }, "release_url": "https://pypi.org/project/phonology-engine/0.1.14/", "requires_dist": null, "requires_python": "", "summary": "Module to get stress and syllables for words in a given sentence in Lithuanian language.", "version": "0.1.14" }, "last_serial": 4644668, "releases": { "0.1.10": [ { "comment_text": "", "digests": { "md5": "052c545a4ee3598d7757774fdabf3863", "sha256": "2b8fdc73cfc97d05510c65fb748ebfe13571cdbb4d23cb8ef21ad822a5ed60a0" }, "downloads": -1, "filename": "phonology_engine-0.1.10-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "052c545a4ee3598d7757774fdabf3863", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 2756596, "upload_time": "2018-10-17T11:51:13", "url": "https://files.pythonhosted.org/packages/7f/aa/d1300db06b0ee18909d36fff0ab34228e9a0e4fe55d5a82ac3242a6f269f/phonology_engine-0.1.10-py2.py3-none-any.whl" } ], "0.1.12": [ { "comment_text": "", "digests": { "md5": "625c0da21f34f8e5e1b03bb4681389e6", "sha256": "f9b11ba32a011389f61e1aacad898bed47a07f0cba08da27c344ff2e65599f51" }, "downloads": -1, "filename": "phonology_engine-0.1.12-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "625c0da21f34f8e5e1b03bb4681389e6", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 2757320, "upload_time": "2018-10-19T19:47:57", "url": 
"https://files.pythonhosted.org/packages/e9/94/71ecc8418b7915624401e88934cbd155cd3395790071d73226399f542177/phonology_engine-0.1.12-py2.py3-none-any.whl" } ], "0.1.13": [ { "comment_text": "", "digests": { "md5": "576054e9cfc3f9fec82ef99c9586c411", "sha256": "65128fd1702ac0bea39538414ce7f4c36af7252d02698aaee1d61b7d279faea9" }, "downloads": -1, "filename": "phonology_engine-0.1.13-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "576054e9cfc3f9fec82ef99c9586c411", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 2757339, "upload_time": "2018-10-20T12:01:16", "url": "https://files.pythonhosted.org/packages/ec/66/44029b9c09f585c15333afd9cf7d18fb7e257d26506147e9ae411df4d014/phonology_engine-0.1.13-py2.py3-none-any.whl" } ], "0.1.14": [ { "comment_text": "", "digests": { "md5": "7ffb92e71bdff66f7fbbabfbc5d2d568", "sha256": "aae3874faffe791081fb10e5948c5268403eed1d5fea3fb6a2d10aab57ac72f7" }, "downloads": -1, "filename": "phonology_engine-0.1.14-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "7ffb92e71bdff66f7fbbabfbc5d2d568", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 2757588, "upload_time": "2018-12-29T20:38:08", "url": "https://files.pythonhosted.org/packages/e2/41/5a8a352ed7f993a18cf60623742055966fa05de4fb35d3ce66a79ce2cb9a/phonology_engine-0.1.14-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7ffb92e71bdff66f7fbbabfbc5d2d568", "sha256": "aae3874faffe791081fb10e5948c5268403eed1d5fea3fb6a2d10aab57ac72f7" }, "downloads": -1, "filename": "phonology_engine-0.1.14-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "7ffb92e71bdff66f7fbbabfbc5d2d568", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 2757588, "upload_time": "2018-12-29T20:38:08", "url": 
"https://files.pythonhosted.org/packages/e2/41/5a8a352ed7f993a18cf60623742055966fa05de4fb35d3ce66a79ce2cb9a/phonology_engine-0.1.14-py2.py3-none-any.whl" } ] }