{ "info": { "author": "Haebin Shin", "author_email": "sunsal0704@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "Natural Language :: Korean", "Operating System :: OS Independent", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering", "Topic :: Text Processing :: Linguistic" ], "description": "Jamotools\n=========\n\n|Build Status| |GitHub Tag| |PyPI version| |Python version| |License|\n\nA library for Korean Jamo split and vectorize.\n\nInstall\n-------\n\n.. code:: sh\n\n pip install jamotools\n\nUnicode of Korean\n-----------------\n\nAccording to the Version 9.0.0 database of the Unicode Consortium, the\nblocks specified in *Hangul* (Korean) in Unicode are as follows.\n\n- Hangul Jamo: 1100 ~ 11FF\n- WON SIGN in Currency Symbols: 20A9\n- HANGUL DOT TONE MARK in CJK Symbols and Punctuation: 302E ~ 302F\n- Hangul Compatibility Jamo : 3130 ~ 318F\n- Hangul in Enclosed CJK Letters and Months: 3200 ~ 321E, 3260 ~ 327F\n- Hangul Jamo Extended-A : A960 ~ A97F\n- Hangul Syllables : AC00 ~ D7AF\n- Hangul Jamo Extended-B : D7B0 ~ D7FF\n- Halfwidth Hangul variants in Halfwidth and Fullwidth Forms: FFA0 ~\n FFDC\n- FULLWIDTH WON SIGN in Halfwidth and Fullwidth Forms: FFE6\n\nJamo\n~~~~\n\nHangul is made of basic letters called *Jamo*. In unicode, Jamo is\ndefined by several kinds which contain old Hangul that does not use in\nnowadays. Jamotools only supports modern Hangul Jamo area as follows.\n\n- `Hangul Jamo `__: Consist of\n Choseong, Jungseong, Jongseong. It is divided mordern Hangul and old\n Hangul that does not use in nowadays. Jamotools supports modern\n Hangul Jamo area.\n\n - 1100 ~ 1112 (Choseong)\n - 1161 ~ 1175 (Jungseong)\n - 11A8 ~ 11C2 (Jongseong)\n\n- `Hangul Compatibility\n Jamo `__: It is a Korean\n Hangul language area that is compatible with the Hangul character\n standard (KS X 1001). It is not divided Choseong, Jungseong,\n Jongseong.\n\n - 3131 ~ 3163 (modern Hangul Jamo area)\n\n- `Halfwidth Hangul\n variants `__: This is the\n Korean half-width symbol area. Only modern Korean Jamo exists. The\n general Korean Hangul characterization method is the full-width.\n\n - FFA1 ~ FFDC\n\nManipulating Korean Jamo\n------------------------\n\nAPI for split syllables and join jamos to syllable is based on\n`hangul-utils `__.\n\n- ``split_syllables``: Converts a string of syllables to a string of\n jamos, can be select which convert unicode type.\n- ``join_jamos``: Converts a string of jamos to a string of syllables.\n- ``normalize_to_compat_jamo``: Normalize a string of jamos to a string\n of *Hangul Compatibility Jamo*.\n\n.. code:: py\n\n >>> import jamotools\n >>> print(jamotools.split_syllable_char(u\"\uc548\"))\n ('\u3147', '\u314f', '\u3134')\n\n >>> print(jamotools.split_syllables(u\"\uc548\ub155\ud558\uc138\uc694\"))\n \u3147\u314f\u3134\u3134\u3155\u3147\u314e\u314f\u3145\u3154\u3147\u315b\n\n >>> sentence = u\"\uc55e \uc9d1 \ud325\uc8fd\uc740 \ubd89\uc740 \ud325 \ud48b\ud325\uc8fd\uc774\uace0, \ub4b7\uc9d1 \ucf69\uc8fd\uc740 \ud587\ucf69 \ub2e8\ucf69 \ucf69\uc8fd.\uc6b0\ub9ac \uc9d1\n \uae68\uc8fd\uc740 \uac80\uc740 \uae68 \uae68\uc8fd\uc778\ub370 \uc0ac\ub78c\ub4e4\uc740 \ud587\ucf69 \ub2e8\ucf69 \ucf69\uc8fd \uae68\uc8fd \uc8fd\uba39\uae30\ub97c \uc2eb\uc5b4\ud558\ub354\ub77c.\"\n >>> s = jamotools.split_syllables(sentence)\n >>> print(s)\n \u3147\u314f\u314d \u3148\u3163\u3142 \u314d\u314f\u314c\u3148\u315c\u3131\u3147\u3161\u3134 \u3142\u315c\u313a\u3147\u3161\u3134 \u314d\u314f\u314c \u314d\u315c\u3145\u314d\u314f\u314c\u3148\u315c\u3131\u3147\u3163\u3131\u3157,\n \u3137\u315f\u3145\u3148\u3163\u3142 \u314b\u3157\u3147\u3148\u315c\u3131\u3147\u3161\u3134 \u314e\u3150\u3145\u314b\u3157\u3147 \u3137\u314f\u3134\u314b\u3157\u3147 \u314b\u3157\u3147\u3148\u315c\u3131.\u3147\u315c\u3139\u3163\n \u3148\u3163\u3142 \u3132\u3150\u3148\u315c\u3131\u3147\u3161\u3134 \u3131\u3153\u3141\u3147\u3161\u3134 \u3132\u3150 \u3132\u3150\u3148\u315c\u3131\u3147\u3163\u3134\u3137\u3154 \u3145\u314f\u3139\u314f\u3141\u3137\u3161\u3139\u3147\u3161\u3134\n \u314e\u3150\u3145\u314b\u3157\u3147 \u3137\u314f\u3134\u314b\u3157\u3147 \u314b\u3157\u3147\u3148\u315c\u3131 \u3132\u3150\u3148\u315c\u3131 \u3148\u315c\u3131\u3141\u3153\u3131\u3131\u3163\u3139\u3161\u3139\n \u3145\u3163\u3140\u3147\u3153\u314e\u314f\u3137\u3153\u3139\u314f.\n\n >>> sentence2 = jamotools.join_jamos(s)\n >>> print(sentence2)\n \uc55e \uc9d1 \ud325\uc8fd\uc740 \ubd89\uc740 \ud325 \ud48b\ud325\uc8fd\uc774\uace0, \ub4b7\uc9d1 \ucf69\uc8fd\uc740 \ud587\ucf69 \ub2e8\ucf69 \ucf69\uc8fd.\uc6b0\ub9ac \uc9d1 \uae68\uc8fd\uc740 \uac80\uc740 \uae68\n \uae68\uc8fd\uc778\ub370 \uc0ac\ub78c\ub4e4\uc740 \ud587\ucf69 \ub2e8\ucf69 \ucf69\uc8fd \uae68\uc8fd \uc8fd\uba39\uae30\ub97c \uc2eb\uc5b4\ud558\ub354\ub77c.\n\n >>> print(sentence == sentence2)\n True\n\nJamotools' API supports multiple unicode area of Hangul Jamo for\nmanipulating. Also consists of additional API for manipulating Korean\njamo.\n\n.. code:: py\n\n >>> sentence = u\"\uc790\ubaa8\"\n\n >>> jamos1 = jamotools.split_syllables(sentence, jamo_type=\"JAMO\")\n >>> print([hex(ord(c)) for c in jamos1])\n ['0x110C', '0x1161', '0x1106', '0x1169']\n >>> sentence1 = jamotools.join_jamos(jamos1)\n >>> print(sentence1)\n \uc548\ub155\ud558\uc138\uc694. hello 1\n\n >>> jamos2 = jamotools.split_syllables(sentence, jamo_type=\"COMPAT\")\n >>> print([hex(ord(c)) for c in jamos2])\n ['0x3148', '0x314F', '0x3141', '0x3157']\n >>> sentence2 = jamotools.join_jamos(jamos2)\n >>> print(sentence2)\n \uc548\ub155\ud558\uc138\uc694. hello 1\n\n >>> jamos3 = jamotools.split_syllables(sentence, jamo_type=\"HALFWIDTH\")\n >>> print([hex(ord(c)) for c in jamos3])\n ['0xFFB8', '0xFFC2', '0xFFB1', '0xFFCC']\n >>> sentence3 = jamotools.join_jamos(jamos3)\n >>> print(sentence3)\n \uc548\ub155\ud558\uc138\uc694. hello 1\n\n >>> print(sentence == sentence1 == sentence2 == sentence3)\n True\n\n >>> normalize1 = jamotools.normalize_to_compat_jamo(jamos1)\n >>> normalize2 = jamotools.normalize_to_compat_jamo(jamos2)\n >>> normalize3 = jamotools.normalize_to_compat_jamo(jamos3)\n >>> print(jamos1 == jamos2 == jamos3)\n False\n >>> print(normalize1 == normalize2 == normalize3)\n True\n\nVectorize Korean Jamo\n---------------------\n\nJamotools support vectorize function following RULE. Each RULE is\ndefined how split sentence to Jamo and convert which type of symbols. It\ncan be used character-level Korean text processing.\n\n- ``Vectorizationer``: Class for vectorize text by Rule and pad.\n\n.. code:: py\n\n >>> v = jamotools.Vectorizationer(rule=jamotools.rules.RULE_1, \\\n max_length=None, \\\n prefix_padding_size=0)\n >>> print(v.vectorize(u\"\uc548\ub155\"))\n [13, 21, 45, 4, 27, 62]\n\nCustom RULE\n~~~~~~~~~~~\n\nJamotools can add user's custom RULE class as following steps.\n\n1. Make custom RULE class which inherit RuleBase (e.g. Rule2) in\n `rules.py `__\n like Rule1.\n2. Add constant for custom RULE like\n `RULE\\_1 `__.\n3. Modify\n `get\\_rule `__\n function to return custom RULE class.\n\nThen it can be use as same as RULE\\_1 usage.\n\n.. code:: py\n\n >>> v = jamotools.Vectorizationer(rule=jamotools.rules.RULE_2, \\\n max_length=None, \\\n prefix_padding_size=0)\n\n.. |Build Status| image:: https://travis-ci.org/HaebinShin/jamotools.svg?branch=master\n :target: https://travis-ci.org/HaebinShin/jamotools\n.. |GitHub Tag| image:: https://img.shields.io/github/tag/HaebinShin/jamotools.svg?label=github+tag\n :target: https://github.com/HaebinShin/jamotools/tags\n.. |PyPI version| image:: https://img.shields.io/pypi/v/jamotools.svg\n :target: https://pypi.python.org/pypi/jamotools/\n.. |Python version| image:: https://img.shields.io/pypi/pyversions/jamotools.svg\n :target: https://pypi.python.org/pypi/jamotools/\n.. |License| image:: https://img.shields.io/pypi/l/jamotools.svg\n :target: https://github.com/HaebinShin/jamotools/blob/master/LICENSE\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/HaebinShin/jamotools", "keywords": "", "license": "GPL", "maintainer": "", "maintainer_email": "", "name": "jamotools", "package_url": "https://pypi.org/project/jamotools/", "platform": "", "project_url": "https://pypi.org/project/jamotools/", "project_urls": { "Homepage": "https://github.com/HaebinShin/jamotools" }, "release_url": "https://pypi.org/project/jamotools/0.1.10/", "requires_dist": [ "numpy", "six", "future" ], "requires_python": "", "summary": "A library for Korean Jamo split and vectorize.", "version": "0.1.10" }, "last_serial": 3690090, "releases": { "0.1.10": [ { "comment_text": "", "digests": { "md5": "09c9195d8151a7e4b4372d23c94f6f57", "sha256": "2974f7be1a18dfbb392f02a1dfff1900cae3e3fd27d4b56c1d973f148b40ef3f" }, "downloads": -1, "filename": "jamotools-0.1.10-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "09c9195d8151a7e4b4372d23c94f6f57", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13645, "upload_time": "2018-03-21T04:33:46", "url": "https://files.pythonhosted.org/packages/3d/d6/ec13c68f7ea6a8085966390d256d183bf8488f8b9770028359acb86df643/jamotools-0.1.10-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2bc2ca8c3649352cbd5c9d146c62b82a", "sha256": "b3dc212ae4fc8dd63ec3ab666c98a6390ac5d28d0adba356ed89360c2d2fb115" }, "downloads": -1, "filename": "jamotools-0.1.10.tar.gz", "has_sig": false, "md5_digest": "2bc2ca8c3649352cbd5c9d146c62b82a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23938, "upload_time": "2018-03-21T04:33:48", "url": "https://files.pythonhosted.org/packages/29/bc/e1cf5a51b739eab1cc157b892052c491642fa54a0d938390340ef3180c91/jamotools-0.1.10.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "256aef6a98aad3bc30690d7da1aa9c8c", "sha256": "362809ade25ab73901325ed646bd556b532c050eb96bbda2fc1f56288a0bffca" }, "downloads": -1, "filename": "jamotools-0.1.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "256aef6a98aad3bc30690d7da1aa9c8c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6780, "upload_time": "2018-03-19T17:37:51", "url": "https://files.pythonhosted.org/packages/33/de/592bb976b8e260c32ecd10dfd9bf81482e0acb997494bc403eb320c754e6/jamotools-0.1.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "85fa17d2c815b9b9efbd60c5bb379035", "sha256": "14409596089c7f491be3abbc9f03d336e981eec667c368266b7021560e33c184" }, "downloads": -1, "filename": "jamotools-0.1.3.tar.gz", "has_sig": false, "md5_digest": "85fa17d2c815b9b9efbd60c5bb379035", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18743, "upload_time": "2018-03-19T17:37:52", "url": "https://files.pythonhosted.org/packages/e6/3b/1fe718e4730b5e7f0038f3d8589427f4d8113ae8c1ca4c0e36f83c5f1b98/jamotools-0.1.3.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "110b08cee2a2bff93825799c6e10650e", "sha256": "be35932e7b8c992ad0222440d4744a9d54a576dfde7c535295196d46cb13de2e" }, "downloads": -1, "filename": "jamotools-0.1.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "110b08cee2a2bff93825799c6e10650e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7087, "upload_time": "2018-03-19T18:41:40", "url": "https://files.pythonhosted.org/packages/2e/17/4fd4ac0f8be0f635a24aaa8579ac80b8b6edea1527b5b324e38db279e19f/jamotools-0.1.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a7902a7738234e137e4c3d20070c865", "sha256": "c31e2a48fa630e26223591c0dd73b64fd8bc1f7bf6f7e816b1abb31d0190834f" }, "downloads": -1, "filename": "jamotools-0.1.5.tar.gz", "has_sig": false, "md5_digest": "1a7902a7738234e137e4c3d20070c865", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19012, "upload_time": "2018-03-19T18:41:42", "url": "https://files.pythonhosted.org/packages/e9/fe/e16d724766dfe52c0bc0e92e49f4640c6b0aeab988c02715d5cb48968466/jamotools-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "2e602ed66fd5d75f3784309a6010ca90", "sha256": "03811b57f4f1e8004bf78fe64bf273641e07743c9660aaefb87daee835dc795a" }, "downloads": -1, "filename": "jamotools-0.1.6-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "2e602ed66fd5d75f3784309a6010ca90", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7493, "upload_time": "2018-03-20T01:36:37", "url": "https://files.pythonhosted.org/packages/61/6a/e333a9713bff2cafaaee54cf51e5409ecae584645f3b727c9e4674283855/jamotools-0.1.6-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "25f1de1c3880a0d4662607248b9db303", "sha256": "1a14161ce32d8100b2019bea5e419c2790578989b41b63860211174c03d7b14d" }, "downloads": -1, "filename": "jamotools-0.1.6.tar.gz", "has_sig": false, "md5_digest": "25f1de1c3880a0d4662607248b9db303", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19445, "upload_time": "2018-03-20T01:36:38", "url": "https://files.pythonhosted.org/packages/b0/f2/a7d04e5a4f5a1797ae7b3b8d72d9920bf7a994c899ab686207be687bc17a/jamotools-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "0a9e05b4e2cf81bd09f1060ae8d1552d", "sha256": "83617e240d28f96dc64c114d456926816091a72d63aae68b1d9037b0cfb5ac9f" }, "downloads": -1, "filename": "jamotools-0.1.7-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "0a9e05b4e2cf81bd09f1060ae8d1552d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7499, "upload_time": "2018-03-20T02:14:20", "url": "https://files.pythonhosted.org/packages/82/a4/1768a2b2b7107a40cc521fb415b8d212b8d178ff5b8692c2e04ee7ab25bc/jamotools-0.1.7-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5e3ed9b1e8bf7db64e8eb02d03031d7c", "sha256": "2090123f9b2edd55094366a67693173e77f1af43b35ba775368a689ea22b99af" }, "downloads": -1, "filename": "jamotools-0.1.7.tar.gz", "has_sig": false, "md5_digest": "5e3ed9b1e8bf7db64e8eb02d03031d7c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19443, "upload_time": "2018-03-20T02:14:22", "url": "https://files.pythonhosted.org/packages/5c/b4/e890bb2c9e6f0c6c720375de5fbce4260fe73eb7f8380ec8fa1e15942dd7/jamotools-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "d1c7adc7990e04d87c209379d4911bd7", "sha256": "d6afc46260a03bd61976f8bb35736167d6c99f244558088d302ae43d81b57e3f" }, "downloads": -1, "filename": "jamotools-0.1.8-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "d1c7adc7990e04d87c209379d4911bd7", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13626, "upload_time": "2018-03-21T04:13:56", "url": "https://files.pythonhosted.org/packages/0e/85/525ee9c37b3d2dd7bcb52938f09ddb74a02cb30561fe35923f1a9608fc38/jamotools-0.1.8-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "15043753cbf9dcbcc921dfa9cc210027", "sha256": "e4f85dc56cc12c05cb964efde776f8652a0ac69435e116ade19a96246faff53a" }, "downloads": -1, "filename": "jamotools-0.1.8.tar.gz", "has_sig": false, "md5_digest": "15043753cbf9dcbcc921dfa9cc210027", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23939, "upload_time": "2018-03-21T04:13:57", "url": "https://files.pythonhosted.org/packages/76/f5/beffa04b6f042c27258643a9e5f140af10a727cde04fe86bd4f6eaddd4fb/jamotools-0.1.8.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "09c9195d8151a7e4b4372d23c94f6f57", "sha256": "2974f7be1a18dfbb392f02a1dfff1900cae3e3fd27d4b56c1d973f148b40ef3f" }, "downloads": -1, "filename": "jamotools-0.1.10-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "09c9195d8151a7e4b4372d23c94f6f57", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13645, "upload_time": "2018-03-21T04:33:46", "url": "https://files.pythonhosted.org/packages/3d/d6/ec13c68f7ea6a8085966390d256d183bf8488f8b9770028359acb86df643/jamotools-0.1.10-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2bc2ca8c3649352cbd5c9d146c62b82a", "sha256": "b3dc212ae4fc8dd63ec3ab666c98a6390ac5d28d0adba356ed89360c2d2fb115" }, "downloads": -1, "filename": "jamotools-0.1.10.tar.gz", "has_sig": false, "md5_digest": "2bc2ca8c3649352cbd5c9d146c62b82a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23938, "upload_time": "2018-03-21T04:33:48", "url": "https://files.pythonhosted.org/packages/29/bc/e1cf5a51b739eab1cc157b892052c491642fa54a0d938390340ef3180c91/jamotools-0.1.10.tar.gz" } ] }