{ "info": { "author": "Narongdej Sarnsuwan", "author_email": "narongdej@sarnsuwan.com", "bugtrack_url": null, "classifiers": [], "description": "# pythaiwordcut - Thai Wordcut in Python\n\n[![Codacy Badge](https://api.codacy.com/project/badge/Grade/c4cb39daa5a54ffd9c1a797072e0f6d2)](https://www.codacy.com/app/narongdejsrn/pythaiwordcut?utm_source=github.com&utm_medium=referral&utm_content=narongdejsrn/pythaiwordcut&utm_campaign=Badge_Grade)\n![PyPI - Downloads](https://img.shields.io/pypi/dm/pythaiwordcut.svg)\n![PyPI - License](https://img.shields.io/pypi/l/pythaiwordcut.svg)\n![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pythaiwordcut.svg)\n\n-----\n\nA simple Thai wordcut written in Python, based on the Maximum Matching algorithm by [S. Manabu](http://www.aclweb.org/anthology/E14-4016).\nUses the Lexitron dictionary (by [NECTEC](http://www.sansarn.com/lexto/license-lexitron.php)) by default.\n\n> Please note: This project is under development and should not be used in production; all functions and interfaces are subject to change. 
If you have an issue or suggestion, please feel free to ask; contributions are also very welcome :)\n\n## Installation\n\n```bash\npip install pythaiwordcut\n```\n\nor\n\n```bash\ngit clone https://github.com/zenyai/pythaiwordcut.git\ncd pythaiwordcut\npython setup.py install\n```\n\n## Usage\n\n```python\nimport pythaiwordcut as pwt\n\npt = pwt.wordcut(removeRepeat=True, stopDictionary=\"\", removeSpaces=True, minLength=1, stopNumber=False, removeNonCharacter=False, caseSensitive=True, ngram=(1, 2), negation=False)\nprint(\"|\".join(pt.segment(u'\u0e17\u0e14\u0e2a\u0e2d\u0e1a\u0e01\u0e32\u0e23\u0e15\u0e31\u0e14\u0e04\u0e33')))\n```\n\n * removeRepeat: remove intentionally repeated characters (deliberate misspellings) such as (\u0e2a\u0e1a\u0e32\u0e22\u0e22\u0e22\u0e22\u0e22\u0e22)\n * stopDictionary: remove words that exist in the specified text file (one word per line)\n * removeSpaces: remove blank spaces\n * minLength: minimum length of each word\n * stopNumber: remove numbers if present\n * removeNonCharacter: remove characters that are not Thai or English characters\n * caseSensitive: if set to false, stop words are removed regardless of case\n * ngram: add word n-grams in the given range, e.g. (1, 2)\n * negation: if set to true, adds NOT_ to every word after a negation word and space", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/narongdejsrn/pythaiwordcut", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "pythaiwordcut", "package_url": "https://pypi.org/project/pythaiwordcut/", "platform": "", "project_url": "https://pypi.org/project/pythaiwordcut/", "project_urls": { "Homepage": "https://github.com/narongdejsrn/pythaiwordcut" }, "release_url": "https://pypi.org/project/pythaiwordcut/0.2.0/", "requires_dist": null, "requires_python": ">3.5.0", "summary": "Simple Thai Wordcut in Python using Maximum Matching", "version": "0.2.0" }, "last_serial": 
5377999, "releases": { "0.1.10": [ { "comment_text": "", "digests": { "md5": "9fb861fe4291bc263ee7e27a1030f363", "sha256": "676c088393e5e840d86de6cd6df22bf24296b818955cc697522ab0f7821747d3" }, "downloads": -1, "filename": "pythaiwordcut-0.1.10.tar.gz", "has_sig": false, "md5_digest": "9fb861fe4291bc263ee7e27a1030f363", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218853, "upload_time": "2016-06-27T10:16:23", "url": "https://files.pythonhosted.org/packages/71/83/fa50daca7b1cefe151fbb65e7aff380678c70e7d420ad99054d09bd65df4/pythaiwordcut-0.1.10.tar.gz" } ], "0.1.11": [ { "comment_text": "", "digests": { "md5": "c939943300641c2e1683a22d3081e34c", "sha256": "d48bd8315b1564722ab1c75c0331b4605fec8d5eb652a958c51accf87cff320d" }, "downloads": -1, "filename": "pythaiwordcut-0.1.11.tar.gz", "has_sig": false, "md5_digest": "c939943300641c2e1683a22d3081e34c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218913, "upload_time": "2016-07-05T13:23:00", "url": "https://files.pythonhosted.org/packages/a5/48/3c46985777842b602f6d22afb60f9ff3a22595180dea3a48e2c38cc9e1e5/pythaiwordcut-0.1.11.tar.gz" } ], "0.1.12": [ { "comment_text": "", "digests": { "md5": "2478966a2cda45ad125edb89caf77793", "sha256": "ddde261cc49c3ccde5e5a751d9b0df9277062ec8bd29377081e9f933954f6541" }, "downloads": -1, "filename": "pythaiwordcut-0.1.12.tar.gz", "has_sig": false, "md5_digest": "2478966a2cda45ad125edb89caf77793", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218926, "upload_time": "2016-07-05T13:53:23", "url": "https://files.pythonhosted.org/packages/f6/4f/9727713e9c29f5591c6f4080de4c649129df7936c85e9af9aec122864efc/pythaiwordcut-0.1.12.tar.gz" } ], "0.1.13": [ { "comment_text": "", "digests": { "md5": "0a8fa480e341ad2204e10ceb755faac5", "sha256": "9ed847bdf17e5d3b50e9ae258ad7cd05d30f24f86988aa9c7c63bc9ea48d903f" }, "downloads": -1, "filename": "pythaiwordcut-0.1.13.tar.gz", "has_sig": 
false, "md5_digest": "0a8fa480e341ad2204e10ceb755faac5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218953, "upload_time": "2016-07-05T15:31:55", "url": "https://files.pythonhosted.org/packages/3b/76/425c0c2c31da8af23b33d16a06ff9ba3d4f42255544416a506b917e23cc0/pythaiwordcut-0.1.13.tar.gz" } ], "0.1.14": [ { "comment_text": "", "digests": { "md5": "31cf1efba08742c68a7f94c86628c88d", "sha256": "b16cb35273b2a68ad3a65d227523191620d898c76a68e6491b5fdc8b409b577a" }, "downloads": -1, "filename": "pythaiwordcut-0.1.14.tar.gz", "has_sig": false, "md5_digest": "31cf1efba08742c68a7f94c86628c88d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218970, "upload_time": "2016-07-08T09:47:18", "url": "https://files.pythonhosted.org/packages/8d/ff/0adf6b11e449e160f84b19d65e3874954454f817df2674fd4f60c855dfdd/pythaiwordcut-0.1.14.tar.gz" } ], "0.1.15": [ { "comment_text": "", "digests": { "md5": "ebeb0370a3be519b6223376813b33659", "sha256": "8c459ae0cb6607966738dd3a9d626bcb601b4d83bd9c3689a2072a6997a738ac" }, "downloads": -1, "filename": "pythaiwordcut-0.1.15-py2-none-any.whl", "has_sig": false, "md5_digest": "ebeb0370a3be519b6223376813b33659", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 227740, "upload_time": "2019-01-13T15:27:58", "url": "https://files.pythonhosted.org/packages/33/cd/309b0db7ea182aed59eadea9ceefb3c399dd25284ea13fc2330724ad0026/pythaiwordcut-0.1.15-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "aa5fa2a0d459e347982d37806ef1ff30", "sha256": "f79bc83a01dfefe59e9a6608ac2b7bf583b9d3ff1540ea5ed5139f1aafdc8ded" }, "downloads": -1, "filename": "pythaiwordcut-0.1.15.tar.gz", "has_sig": false, "md5_digest": "aa5fa2a0d459e347982d37806ef1ff30", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218891, "upload_time": "2019-01-13T15:28:00", "url": 
"https://files.pythonhosted.org/packages/83/af/08112ea47fb89254f93224a41df66c5d819621095db8fb1a766bd1b87a6c/pythaiwordcut-0.1.15.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "176b639eb5fdf5853c516919c5f6449d", "sha256": "03284eb40ce8988fc1aceaa1aad5866f9cbc4c792acb812b4528c1bdffdde4f0" }, "downloads": -1, "filename": "pythaiwordcut-0.1.4.tar.gz", "has_sig": false, "md5_digest": "176b639eb5fdf5853c516919c5f6449d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218376, "upload_time": "2016-06-15T18:32:19", "url": "https://files.pythonhosted.org/packages/d1/8e/7b541ce0c53890be25c79bad324731099fa46ffb12da0f6388a16292255f/pythaiwordcut-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "d3667eb12fe890be682d977ed03e07b0", "sha256": "65b6758d514a1252278b30cbf053d837290f24ad34c1a5dfb45a5f22c6c03fd2" }, "downloads": -1, "filename": "pythaiwordcut-0.1.5.tar.gz", "has_sig": false, "md5_digest": "d3667eb12fe890be682d977ed03e07b0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218029, "upload_time": "2016-06-17T17:07:06", "url": "https://files.pythonhosted.org/packages/02/68/c6373f98237856c411e8466745924567d287d9c9940195a16bd7f9cc41cd/pythaiwordcut-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "f1cc80cd0090bbe82055b8da39493aaa", "sha256": "5c4c78b5ec2e66addab7958729c888c0af073c9060639db1244eb37ae80cd2fe" }, "downloads": -1, "filename": "pythaiwordcut-0.1.6.tar.gz", "has_sig": false, "md5_digest": "f1cc80cd0090bbe82055b8da39493aaa", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218179, "upload_time": "2016-06-18T08:18:28", "url": "https://files.pythonhosted.org/packages/04/81/3cc571214f8400f5a57a06401d108f63abdda5882a4fbe2cd424ccf9c730/pythaiwordcut-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "ba5141818b8419bb35f4f2e68cf9393c", "sha256": 
"80ff54fbc3bcca3d7f9a4afff47c2b04e5edefccf756e41d8633f5889a2b3487" }, "downloads": -1, "filename": "pythaiwordcut-0.1.7.tar.gz", "has_sig": false, "md5_digest": "ba5141818b8419bb35f4f2e68cf9393c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218351, "upload_time": "2016-06-18T09:37:04", "url": "https://files.pythonhosted.org/packages/1a/2b/8294a8f0b35f20624131334864260a6d323eeb6a88eb410e583eba956594/pythaiwordcut-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "4a0c4b1ad0854644e77868a9e62e3c2c", "sha256": "b13858ce2ee308b1e23d921aeace75faacda150c705d70375dc23d7272790fb5" }, "downloads": -1, "filename": "pythaiwordcut-0.1.8.tar.gz", "has_sig": false, "md5_digest": "4a0c4b1ad0854644e77868a9e62e3c2c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218594, "upload_time": "2016-06-18T09:57:08", "url": "https://files.pythonhosted.org/packages/d6/71/bc97d4911a4127f22aa35711f0e680983c1852501de379658d265cce659f/pythaiwordcut-0.1.8.tar.gz" } ], "0.1.9": [ { "comment_text": "", "digests": { "md5": "e6a9d07c26e8d74f065bf6818c6ca3c6", "sha256": "57b121f80c4f48250872ee0472ecd3ca6bc172de58ea7863adb8653fdbb0a31d" }, "downloads": -1, "filename": "pythaiwordcut-0.1.9.tar.gz", "has_sig": false, "md5_digest": "e6a9d07c26e8d74f065bf6818c6ca3c6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 218575, "upload_time": "2016-06-18T10:13:52", "url": "https://files.pythonhosted.org/packages/e8/a0/f97cc7c533d6a053b65fc8555fa06940b8642744fa1e47fce26a5ca565e4/pythaiwordcut-0.1.9.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "578a610421db0920e1370f76c46dcb2a", "sha256": "7aba80dc5a415305fecb8ff1dd2ce4769b94474851811e7b045760874bf69b14" }, "downloads": -1, "filename": "pythaiwordcut-0.2.0.tar.gz", "has_sig": false, "md5_digest": "578a610421db0920e1370f76c46dcb2a", "packagetype": "sdist", "python_version": "source", "requires_python": 
">3.5.0", "size": 219623, "upload_time": "2019-06-09T15:53:53", "url": "https://files.pythonhosted.org/packages/68/77/a4378cd235bd7ae0350d888b4baf855c6e1a094474c5a7d62f060587a92a/pythaiwordcut-0.2.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "578a610421db0920e1370f76c46dcb2a", "sha256": "7aba80dc5a415305fecb8ff1dd2ce4769b94474851811e7b045760874bf69b14" }, "downloads": -1, "filename": "pythaiwordcut-0.2.0.tar.gz", "has_sig": false, "md5_digest": "578a610421db0920e1370f76c46dcb2a", "packagetype": "sdist", "python_version": "source", "requires_python": ">3.5.0", "size": 219623, "upload_time": "2019-06-09T15:53:53", "url": "https://files.pythonhosted.org/packages/68/77/a4378cd235bd7ae0350d888b4baf855c6e1a094474c5a7d62f060587a92a/pythaiwordcut-0.2.0.tar.gz" } ] }