{ "info": { "author": "Herman Schaaf", "author_email": "herman@ironzebra.com", "bugtrack_url": null, "classifiers": [], "description": "[![Build Status](https://travis-ci.org/hermanschaaf/mafan.svg?branch=master)](https://travis-ci.org/hermanschaaf/mafan)\n\n===========\nMafan - Toolkit for working with Chinese in Python\n===========\n\nMafan is a collection of Python tools for making your life working with Chinese so much less \u9ebb\u70e6 (mafan, i.e. troublesome). \n\nContained in here is an ever-growing collection of loosely-related tools, broken down into several files. These are:\n\ninstallation\n===========\n\nInstall through pip:\n\n pip install mafan\n\nencodings\n===========\n\n`encodings` contains functions for converting files from any number of \u9ebb\u70e6 character encodings to something more sane (utf-8, by default). For example:\n\n```python\nfrom mafan import encoding\n\nfilename = 'ugly_big5.txt' # name or path of file as string\nencoding.convert(filename) # creates a file with name 'ugly_big5_utf-8.txt' in glorious utf-8 encoding\n```\n\n\ntext\n===========\n\n`text` contains some functions for working with strings. Things like detecting english in a string, whether a string has Chinese punctuation, etc. Check out `text.py` for all the latest goodness. 
It also contains a handy wrapper for the jianfan package for converting between simplified and traditional characters:\n\n```python\n>>> from mafan import simplify, tradify\n>>> string = u'\u8fd9\u662f\u9ebb\u70e6\u5566'\n>>> print tradify(string) # convert string to traditional\n\u9019\u662f\u9ebb\u7169\u5566\n>>> print simplify(tradify(string)) # convert back to simplified\n\u8fd9\u662f\u9ebb\u70e6\u5566\n```\n\nThe `has_punctuation` and `contains_latin` functions are useful for checking whether a string contains Chinese punctuation or Latin characters:\n\n```python\n>>> from mafan import text\n>>> text.has_punctuation(u'\u8fd9\u662f\u9ebb\u70e6\u5566') # check for any Chinese punctuation (full stops, commas, quotation marks, etc.)\nFalse\n>>> text.has_punctuation(u'\u8fd9\u662f\u9ebb\u70e6\u5566.')\nFalse\n>>> text.has_punctuation(u'\u8fd9\u662f\u9ebb\u70e6\u5566\u3002')\nTrue\n>>> text.contains_latin(u'\u8fd9\u662f\u9ebb\u70e6\u5566\u3002')\nFalse\n>>> text.contains_latin(u'You are\u9ebb\u70e6\u5566\u3002')\nTrue\n```\n\nYou can also test whether sentences or documents use simplified characters, traditional characters, both, or neither:\n\n```python\n>>> import mafan\n>>> from mafan import text\n>>> text.is_simplified(u'\u8fd9\u662f\u9ebb\u70e6\u5566')\nTrue\n>>> text.is_traditional(u'Hello,\u9019\u662f\u9ebb\u7169\u5566') # ignores non-Chinese characters\nTrue\n\n# Or done another way:\n>>> text.identify(u'\u8fd9\u662f\u9ebb\u70e6\u5566') is mafan.SIMPLIFIED\nTrue\n>>> text.identify(u'\u9019\u662f\u9ebb\u7169\u5566') is mafan.TRADITIONAL\nTrue\n>>> text.identify(u'\u8fd9\u662f\u9ebb\u70e6\u5566! 
\u9019\u662f\u9ebb\u7169\u5566') is mafan.BOTH\nTrue\n>>> text.identify(u'This is so mafan.') is mafan.NEITHER # or None\nTrue\n```\n\nThe identification functionality is provided as a very thin wrapper around Thomas Roten's [hanzidentifier](https://github.com/tsroten/hanzidentifier), which is included as part of mafan.\n\nAnother function that comes pre-built into Mafan is `split_text`, which tokenizes Chinese sentences into words:\n\n```python\n>>> from mafan import split_text\n>>> split_text(u\"\u9019\u662f\u9ebb\u7169\u5566\")\n[u'\\u9019', u'\\u662f', u'\\u9ebb\\u7169', u'\\u5566']\n>>> print ' '.join(split_text(u\"\u9019\u662f\u9ebb\u7169\u5566\"))\n\u9019 \u662f \u9ebb\u7169 \u5566\n```\n\nYou can also pass the optional boolean `include_part_of_speech` parameter to get tagged words back:\n\n```python\n>>> split_text(u\"\u9019\u662f\u9ebb\u7169\u5566\", include_part_of_speech=True)\n[(u'\\u9019', 'r'), (u'\\u662f', 'v'), (u'\\u9ebb\\u7169', 'x'), (u'\\u5566', 'y')]\n```\n\npinyin\n===========\n\n`pinyin` contains functions for working with pinyin and converting between pinyin formats. At the moment, the only function in there is one to convert numbered pinyin to pinyin with the correct tone marks. For example:\n\n```python\n>>> from mafan import pinyin\n>>> print pinyin.decode(\"ni3hao3\")\nn\u01d0h\u01ceo\n```\n\ntraditional characters\n===========\n\nIf you want to be able to use `split_text` on traditional characters, you have two options: \n\n - Either set an environment variable, `MAFAN_DICTIONARY_PATH`, to the absolute path to a local copy of this [dictionary file](https://github.com/fxsjy/jieba/raw/master/extra_dict/dict.txt.big),\n - or install the `mafan_traditional` convenience package: `pip install mafan_traditional`. If this package is installed and available, mafan will use this extended dictionary file by default. 
\n\nContributors:\n-----------\n * Herman Schaaf ([IronZebra.com](http://www.ironzebra.com)) (Author)\n * Thomas Roten ([Github](https://github.com/tsroten/))\n * [JOEWONGLVFS](https://github.com/JOEWONGLVFS)\n * Casper CY Chiang ([Github](https://github.com/cychiang))\n\nAny contributions are very welcome! \n\n\nSites using this:\n-----------\n * [ChineseLevel.com](http://www.ChineseLevel.com)", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hermanschaaf/mafan", "keywords": null, "license": "LICENSE.txt", "maintainer": null, "maintainer_email": null, "name": "mafan", "package_url": "https://pypi.org/project/mafan/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/mafan/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/hermanschaaf/mafan" }, "release_url": "https://pypi.org/project/mafan/0.3.1/", "requires_dist": null, "requires_python": null, "summary": "A toolbox for working with the Chinese language in Python", "version": "0.3.1" }, "last_serial": 2452565, "releases": { "0.1.0": [], "0.1.1": [ { "comment_text": "", "digests": { "md5": "274ac483451d896a4f4b00097cf3c41f", "sha256": "d12181128f9fe60c435732ba63a62476d2e9c7c059fba3a79be58636462889e1" }, "downloads": -1, "filename": "mafan-0.1.1.tar.gz", "has_sig": false, "md5_digest": "274ac483451d896a4f4b00097cf3c41f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4703, "upload_time": "2013-02-14T14:41:57", "url": "https://files.pythonhosted.org/packages/48/07/67dfdf6da498187ed92ccff11aeeb8b84acdd9ea2860492332e8d7ddbf48/mafan-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "21b939b9bd56362c2bb2321c9a311acb", "sha256": "a1acca671464b18edc79ec14e78e0e9d9eb976877e50b05f3fb5caf147af20f5" }, "downloads": -1, "filename": "mafan-0.1.2.tar.gz", "has_sig": false, "md5_digest": 
"21b939b9bd56362c2bb2321c9a311acb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4966, "upload_time": "2013-02-14T14:58:37", "url": "https://files.pythonhosted.org/packages/0d/71/384d0442259c67f353957d053da51dde006fbe6d6a3c20b3d37d80f75781/mafan-0.1.2.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "7da40ef2443097724a2f681bd621b119", "sha256": "a142c2176e9f50f2937c88f58729a68b9e362c8f72fedca90e91b2738f188d2a" }, "downloads": -1, "filename": "mafan-0.1.4.tar.gz", "has_sig": false, "md5_digest": "7da40ef2443097724a2f681bd621b119", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4995, "upload_time": "2013-02-14T15:06:27", "url": "https://files.pythonhosted.org/packages/8f/51/47bc432f4d676cce30bc61fe8ac5fedb5dfe9df53c287cb2532a98647e90/mafan-0.1.4.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "1e9e87e45fd3ccc404b03760f2da5d69", "sha256": "c479e03bbec23913deda914efcd860a59dc69ee0a032e2f1f04c9f40b194fece" }, "downloads": -1, "filename": "mafan-0.2.0.tar.gz", "has_sig": false, "md5_digest": "1e9e87e45fd3ccc404b03760f2da5d69", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 51808, "upload_time": "2013-04-27T12:02:53", "url": "https://files.pythonhosted.org/packages/26/e9/b40920a4a39f13908966a82d0ecdc693750d8c819a14e0ef2c0895cb7cde/mafan-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "3874736aa5902886733204e35ef7a0a8", "sha256": "1ebd034628d614951834e0b418e633f39a9101033106ebfcd71fb1cefc41e345" }, "downloads": -1, "filename": "mafan-0.2.1.tar.gz", "has_sig": false, "md5_digest": "3874736aa5902886733204e35ef7a0a8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 51837, "upload_time": "2013-06-08T08:56:19", "url": "https://files.pythonhosted.org/packages/24/5f/15de3123e3861d0a148318b684ff28863d4ffbb28c45681b178871fa5f5a/mafan-0.2.1.tar.gz" } ], "0.2.10": [ { 
"comment_text": "", "digests": { "md5": "81cae43f10ba3ff67061390d8fd106f4", "sha256": "b9a04193497a1796dc34be0455828d08fb0bfbfe39647ef033caebe5ad64ac8c" }, "downloads": -1, "filename": "mafan-0.2.10.tar.gz", "has_sig": false, "md5_digest": "81cae43f10ba3ff67061390d8fd106f4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 69511, "upload_time": "2015-07-30T03:09:31", "url": "https://files.pythonhosted.org/packages/b5/68/b2f0cef8ed9d1ee49f0fe03777413007969d123cd6523afca4a4bfcdcf0d/mafan-0.2.10.tar.gz" } ], "0.2.11": [ { "comment_text": "", "digests": { "md5": "eb8b10ed1706c830709eba314355f8ee", "sha256": "208cdd79d59be78c9b3f709a63bae25ec95b20461109bad5cd7bab43b5c23f4a" }, "downloads": -1, "filename": "mafan-0.2.11.tar.gz", "has_sig": false, "md5_digest": "eb8b10ed1706c830709eba314355f8ee", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 69998, "upload_time": "2016-11-07T15:21:23", "url": "https://files.pythonhosted.org/packages/3b/d1/5f2960d13433bd4e76fd7c490ff0e3097f6a0e8d4b9dcaa6702a5c563882/mafan-0.2.11.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "489b50da5793d92d2bad37c96cfcf46f", "sha256": "5fba9e433639e80de542d755601d2e8c15e664818ec72c111a106a5e7faa6b23" }, "downloads": -1, "filename": "mafan-0.2.2.tar.gz", "has_sig": false, "md5_digest": "489b50da5793d92d2bad37c96cfcf46f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 51863, "upload_time": "2013-06-09T08:49:17", "url": "https://files.pythonhosted.org/packages/b2/60/53d450c8c3582c27c1ce9037be6692fa73d1f394e5eb8ec5bbfdca2d3839/mafan-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "ae1657e9e23c68bb3c2fbdb19832b42b", "sha256": "aa4b1bf18439ba4aeedb941f4b5aaceae3ca31c17392c22ccb7c0c08e5f2dfed" }, "downloads": -1, "filename": "mafan-0.2.3.tar.gz", "has_sig": false, "md5_digest": "ae1657e9e23c68bb3c2fbdb19832b42b", "packagetype": "sdist", "python_version": 
"source", "requires_python": null, "size": 53969, "upload_time": "2013-06-09T13:06:35", "url": "https://files.pythonhosted.org/packages/84/87/7413081db3f42e996f515860e986c7951701c18665c6142a33588d4a90f9/mafan-0.2.3.tar.gz" } ], "0.2.4": [ { "comment_text": "", "digests": { "md5": "e648ed06c8265bf65b46f7d5b7b2c103", "sha256": "e11802dfd97e3f77eb38d148a047bebcd8bc2e5a0f46a6e14f9d4de0e7a6823d" }, "downloads": -1, "filename": "mafan-0.2.4.tar.gz", "has_sig": false, "md5_digest": "e648ed06c8265bf65b46f7d5b7b2c103", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54284, "upload_time": "2014-01-26T09:09:34", "url": "https://files.pythonhosted.org/packages/76/67/7bdbf4731ee824dd89ab82723d89e2701f8091ee5366fe3bbff26bf3dfe7/mafan-0.2.4.tar.gz" } ], "0.2.5": [ { "comment_text": "", "digests": { "md5": "4aecc2b6c1ea210459b1ba3ac909fb3a", "sha256": "7bfbd623d2399dcf851e6ff4c793eab8caecc2fc6e23607a061bf4c7f5edfa1a" }, "downloads": -1, "filename": "mafan-0.2.5.tar.gz", "has_sig": false, "md5_digest": "4aecc2b6c1ea210459b1ba3ac909fb3a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54294, "upload_time": "2014-01-26T09:11:59", "url": "https://files.pythonhosted.org/packages/12/e5/dd0323cbf334e706c0909f321f30815168d6d1f8cb7ff7c52c9797dec8d0/mafan-0.2.5.tar.gz" } ], "0.2.6": [ { "comment_text": "", "digests": { "md5": "a5ccdfdaee8ccc0d9bc92290913cb277", "sha256": "82bd2091f3217bd41653bef656f2ad8e61bec780018056212a587cc9bc69a737" }, "downloads": -1, "filename": "mafan-0.2.6.tar.gz", "has_sig": false, "md5_digest": "a5ccdfdaee8ccc0d9bc92290913cb277", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54526, "upload_time": "2014-11-03T09:10:50", "url": "https://files.pythonhosted.org/packages/d2/e5/a743068194b1bb90d94e0e4ab82da208b062e4862826a322f64143ca5d78/mafan-0.2.6.tar.gz" } ], "0.2.7": [ { "comment_text": "", "digests": { "md5": "072bfed7a32069f3c2d758deeb6331f3", 
"sha256": "5c6543922e158f27545d3643189e20e02c2776549e35f9e9d8dff669e298e152" }, "downloads": -1, "filename": "mafan-0.2.7.tar.gz", "has_sig": false, "md5_digest": "072bfed7a32069f3c2d758deeb6331f3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54634, "upload_time": "2015-07-30T02:34:05", "url": "https://files.pythonhosted.org/packages/35/94/174cfee8566a7d21c703cca1d11a5bbd5ceaf793c70b95ce16a9da65cad1/mafan-0.2.7.tar.gz" } ], "0.2.8": [ { "comment_text": "", "digests": { "md5": "2b68a3066109bdb71afaf217f75ca5ad", "sha256": "628722881a63208f0b79aa9ec28f54fc6fa65874983e4fbb231c2bcb2983566b" }, "downloads": -1, "filename": "mafan-0.2.8.tar.gz", "has_sig": false, "md5_digest": "2b68a3066109bdb71afaf217f75ca5ad", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54575, "upload_time": "2015-07-30T02:42:37", "url": "https://files.pythonhosted.org/packages/58/35/5e952e792619a25a3e9a57a541f21160c73919bfeb6d50f6a94872a42745/mafan-0.2.8.tar.gz" } ], "0.2.9": [ { "comment_text": "", "digests": { "md5": "e54e94b8587b9643b9f4ea328b884033", "sha256": "c265fdedfad1fd8a2d1f361b2af3e0a17d63f231142e1cbc084470d0402daac5" }, "downloads": -1, "filename": "mafan-0.2.9.tar.gz", "has_sig": false, "md5_digest": "e54e94b8587b9643b9f4ea328b884033", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54628, "upload_time": "2015-07-30T03:05:06", "url": "https://files.pythonhosted.org/packages/68/16/a7df20ef4b3e23edc80cb44d120101bce2a1cdc3e492dfea2b4bfb113414/mafan-0.2.9.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "c8b59a0247ae47b8f54f9d3374054faa", "sha256": "82aeba917b587da99bba7e30e01f0c3bd4217eb4ad1596eb0c99a7d80603709c" }, "downloads": -1, "filename": "mafan-0.3.0.tar.gz", "has_sig": false, "md5_digest": "c8b59a0247ae47b8f54f9d3374054faa", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 70232, "upload_time": 
"2016-11-07T16:36:13", "url": "https://files.pythonhosted.org/packages/cf/30/3fbc67ffde6f4422e0f2f29f1c19107bd253c6b4c2966734a7e56f62152d/mafan-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "b0d18c9fd56b58e31e38a07cf4b33c4d", "sha256": "943433e662981776c50d3873fc82878c2bdca024347f2b9bee37d3d48bd69b60" }, "downloads": -1, "filename": "mafan-0.3.1.tar.gz", "has_sig": false, "md5_digest": "b0d18c9fd56b58e31e38a07cf4b33c4d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 70218, "upload_time": "2016-11-10T08:18:49", "url": "https://files.pythonhosted.org/packages/44/43/8a3dfdc1feff7eaab529490f3559361c3f75ce0e8ba2d9195fefabade86e/mafan-0.3.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b0d18c9fd56b58e31e38a07cf4b33c4d", "sha256": "943433e662981776c50d3873fc82878c2bdca024347f2b9bee37d3d48bd69b60" }, "downloads": -1, "filename": "mafan-0.3.1.tar.gz", "has_sig": false, "md5_digest": "b0d18c9fd56b58e31e38a07cf4b33c4d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 70218, "upload_time": "2016-11-10T08:18:49", "url": "https://files.pythonhosted.org/packages/44/43/8a3dfdc1feff7eaab529490f3559361c3f75ce0e8ba2d9195fefabade86e/mafan-0.3.1.tar.gz" } ] }