{ "info": { "author": "Johann-Mattis List", "author_email": "list@shh.mpg.de", "bugtrack_url": null, "classifiers": [], "description": "# SinoPy: Python Library for quantitative tasks in Chinese historical linguistics\n\n[![DOI](https://zenodo.org/badge/30593438.svg)](https://zenodo.org/badge/latestdoi/30593438)\n![PyPI](https://img.shields.io/pypi/v/sinopy.svg)\n\nSinoPy is an attempt to provide useful functionality for users working with Chinese dialects and Sino-Tibetan language data and struggling with tasks like converting characters to Pinyin, analysing characters, or analysing readings in Chinese dialects and other SEA languages. \n\nIf you use the library in your research, please quote it as:\n\n> List, Johann-Mattis (2018): SinoPy: Python Library for quantitative tasks in Chinese historical linguistics. Version 0.3.0. Jena: Max Planck Institute for the Science of Human History. DOI: https://zenodo.org/badge/latestdoi/30593438\n\nThis is intended as a plugin for LingPy, or an addon. The library gives utility functions that prove useful to handle Chinese data in a very broad context, ranging from Chinese character readings up to proposed readings in Middle Chinese and older stages of the language.\n\n## Quick Usage Examples\n\nConvert Baxter's (1992) Middle Chinese transcription system to plain IPA (with tone marks).\n\n```python\n>>> from sinopy import baxter2ipa\n>>> baxter2ipa('bjang')\n'bja\u014b\u00b9'\n>>> baxter2ipa('bjang', segmented=True)\n['b', 'j', 'a', '\u014b', '\u00b9']\n```\n\nConvert Chinese characters to P\u012bny\u012bn\n\n```python\n>>> from sinopy import pinyin\n>>> pinyin('\u6211', variant='cantonese')\n'ngo5'\n>>> pinyin('\u6211', variant='mandarin')\n'w\u01d2'\n```\n\nTry to find character by combining two characters:\n\n```python\n>>> from sinopy import character_from_structure\n>>> character_from_structure('+\u4eba\u6211')\n'\u4fc4'\n```\n\n## More examples\n\nAt the moment, you may have difficulties finding a common idea behind SinoPy,\nas the collection of scripts is very diverse. The general topic, however, are\nbasic operations one frequently encounters when working with Chinese and SEA\nlinguistic data.\n\nBut let's just look at a couple of examples:\n\n```python\n>>> from sinopy import *\n>>> char = \"\u6211\"\n>>> pinyin(char, variant=\"mandarin\")\nw\u01d2\n```\n\nSo obviously, we can convert characters to P\u012bny\u012bn.\n\n```python\n>>> is_chinese(char)\nTrue\n>>> is_chinese('b')\nFalse\n```\n\nSo the library also checks if a character belongs to Chinese Unicode range.\n\nBut we have also a range of functions for handling Middle Chinese and related problems. For example the following:\n\n```python\n>>> parse_baxter('ngaH')\n('ng', '', 'a', 'H')\n```\n\nSo this function will read in a Middle Chinese string (as encoded in the system of Baxter 1992) and return its main constituents (initial, medial, final, and tone).\n\nBut we can also directly convert a character to its Middle Chinese reading:\n\n```python\n>>> chars2baxter(char)\n['ngaX']\n```\n\nOr we can retrieve a basic gloss.\n\n```python\n>>> chars2gloss(char)\n['our, us, i, me, my, we']\n```\n\nA rather complex function is the `sixtuple2baxter` function, which reads in the classical six-character descriptions of the Middle Chinese reading of a given character and yields the Middle Chinese value following Baxter's system. You find a lot of sixtuple readings in the DOC database (published with the [Tower of Babel project](http://starling.rinet.ru/cgi-bin/response.cgi?root=config&morpho=0&basename=\\data\\china\\doc&first=1)).\n\n```python\n>>> sixtuple2baxter('\u87f9\u958b\u4e00\u4e0a\u6d77\u6ce5') \n['n', '', 'oj', 'X']\n>>> chars2baxter('\u4e43') \n['nojX']\n```\n\nYou can also directly try to retrieve the MC reading from passing two f\u01cenqi\u00e8 characters, for example:\n\n```python\n>>> fanqie2mch('\u6d77\u6ce5')\n'xej'\n>>> fanqie2mch('\u6ce5\u6d77')\n'nojX'\n```\n\nAnd if you don't like Baxter's MCH transcriptions, you can simply turn it to IPA:\n\n```python\n>>> baxter2ipa('nojX')\nnoj\u00b2\n>>> baxter2ipa('tsyang')\n'\u02a8a\u014b\u00b9'\n```\n\nAs a final important function, consider the parser for morphemes:\n\n```python\n>>> parse_chinese_morphemes('\u02a8a\u014b\u00b9')\n['\u02a8', '-', 'a', '\u014b', '\u00b9']\n```\n\nThe quintuple that he method returns splits the sequence into its five main constituents, initial, medial, nucleus, coda, and tone.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/lingpy/sinopy", "keywords": "Chinese linguistics,historical linguistics,computer-assisted language comparison", "license": "GPL", "maintainer": "", "maintainer_email": "", "name": "sinopy", "package_url": "https://pypi.org/project/sinopy/", "platform": "", "project_url": "https://pypi.org/project/sinopy/", "project_urls": { "Homepage": "https://github.com/lingpy/sinopy" }, "release_url": "https://pypi.org/project/sinopy/0.3.3/", "requires_dist": null, "requires_python": "", "summary": "A Python library for quantitative tasks in Chinese historical linguistics.", "version": "0.3.3" }, "last_serial": 4529163, "releases": { "0.3.0": [ { "comment_text": "", "digests": { "md5": "6e3ac72dbb66a90fd217a5c22d42590b", "sha256": "40ec83844d96494903bc09c187ae5743385f12387ed7be465797a445e83de1ba" }, "downloads": -1, "filename": "sinopy-0.3.0.tar.gz", "has_sig": false, "md5_digest": "6e3ac72dbb66a90fd217a5c22d42590b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24403, "upload_time": "2018-08-24T10:53:33", "url": "https://files.pythonhosted.org/packages/66/aa/0207b082cf1789eb8654dc0763cddcfe1d55b96d79299af0bb7b74b11905/sinopy-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "5d8d450e84d9e2bcb1c3df54e0638102", "sha256": "fb8cb22f08facd19fac77877874d43604c52854dbc592585a35928488ecd805c" }, "downloads": -1, "filename": "sinopy-0.3.1.tar.gz", "has_sig": false, "md5_digest": "5d8d450e84d9e2bcb1c3df54e0638102", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9714486, "upload_time": "2018-08-24T12:04:06", "url": "https://files.pythonhosted.org/packages/2a/04/ce6261d2fdae3f3ba48153b8c90ae8003ffcf3342a3218adaf30bf416347/sinopy-0.3.1.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "8bdb56c7565fffc73c2d47cca2e4abd1", "sha256": "ed32a06522799cc5126f41dd2e6354fcdaf15eec8751783590db52d96771b1e3" }, "downloads": -1, "filename": "sinopy-0.3.2.tar.gz", "has_sig": false, "md5_digest": "8bdb56c7565fffc73c2d47cca2e4abd1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10053973, "upload_time": "2018-11-25T13:18:27", "url": "https://files.pythonhosted.org/packages/ae/e1/bb3c6a936816ccd4d4e3ae3d9763baf8ed6693ae11e8bc3d9db800a7b878/sinopy-0.3.2.tar.gz" } ], "0.3.3": [ { "comment_text": "", "digests": { "md5": "dad4cf8f9757838d419383ac71259ab8", "sha256": "6a771b82fa5e496299a495d3e77c758227803c235eafc31536c0ba295785400a" }, "downloads": -1, "filename": "sinopy-0.3.3.tar.gz", "has_sig": false, "md5_digest": "dad4cf8f9757838d419383ac71259ab8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10053231, "upload_time": "2018-11-26T10:50:47", "url": "https://files.pythonhosted.org/packages/e1/cd/a53af0c0900563085d046c06a7dd0c92303533084e31510cec7ffd61cc3b/sinopy-0.3.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "dad4cf8f9757838d419383ac71259ab8", "sha256": "6a771b82fa5e496299a495d3e77c758227803c235eafc31536c0ba295785400a" }, "downloads": -1, "filename": "sinopy-0.3.3.tar.gz", "has_sig": false, "md5_digest": "dad4cf8f9757838d419383ac71259ab8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10053231, "upload_time": "2018-11-26T10:50:47", "url": "https://files.pythonhosted.org/packages/e1/cd/a53af0c0900563085d046c06a7dd0c92303533084e31510cec7ffd61cc3b/sinopy-0.3.3.tar.gz" } ] }