{ "info": { "author": "Le Tuan Anh", "author_email": "tuananh.ke@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Environment :: Plugins", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Text Processing" ], "description": "ChirpText is a collection of text processing tools for Python.\nIt is not meant to be a powerful tank like the popular NLTK but a small package which you can pip-install anywhere and write a few lines of code to process textual data.\n\n# Main features\n\n* **[New]** Does not require the `mecab-python3` package to use MeCab/Deko on Windows. Only the binary release (`mecab.exe`) is required.\n* Text annotation framework (TTL, a.k.a. TextTagLib format) which can import/export JSON or human-readable text files\n* Helper functions and useful data for processing English, Japanese, Chinese and Vietnamese.\n* Quick text-based report generation\n* Application configuration file management which can make an educated guess about config files' whereabouts\n* Web fetcher with responsible web crawling ethics (supports caching out of the box)\n* CSV helper functions\n* Console application template\n\nProject homepage: [https://github.com/letuananh/chirptext](https://github.com/letuananh/chirptext)\n\n# Installation\n\n```bash\npip install chirptext\n# pip script sometimes doesn't work properly, so you may want to try this instead\npython3 -m pip install chirptext\n```\n**Note**: The chirptext library no longer supports Python 2. 
Please update to Python 3 to use this package.\n\n# Sample code\n\n## Using MeCab on Windows\nYou can download the MeCab binary package from [http://taku910.github.io/mecab/#download](http://taku910.github.io/mecab/#download) and install it.\nAfter installation, you can try:\n```python\n>>> from chirptext import deko\n>>> sent = deko.parse('\u732b\u304c\u597d\u304d\u3067\u3059\u3002')\n>>> sent.tokens\n[[\u732b(\u540d\u8a5e-\u4e00\u822c/*/*|\u732b|\u30cd\u30b3|\u30cd\u30b3)], [\u304c(\u52a9\u8a5e-\u683c\u52a9\u8a5e/\u4e00\u822c/*|\u304c|\u30ac|\u30ac)], [\u597d\u304d(\u540d\u8a5e-\u5f62\u5bb9\u52d5\u8a5e\u8a9e\u5e79/*/*|\u597d\u304d|\u30b9\u30ad|\u30b9\u30ad)], [\u3067\u3059(\u52a9\u52d5\u8a5e-*/*/*|\u3067\u3059|\u30c7\u30b9|\u30c7\u30b9)], [\u3002(\u8a18\u53f7-\u53e5\u70b9/*/*|\u3002|\u3002|\u3002)], [EOS(-//|||)]]\n>>> sent.words\n['\u732b', '\u304c', '\u597d\u304d', '\u3067\u3059', '\u3002']\n>>> sent[0].pos\n'\u540d\u8a5e'\n>>> sent[0].root\n'\u732b'\n>>> sent[0].reading\n'\u30cd\u30b3'\n```\n\nIf you installed MeCab to a custom location, for example `C:\\mecab\\bin\\mecab.exe`, try\n```python\n>>> deko.set_mecab_bin(\"C:\\\\mecab\\\\bin\\\\mecab.exe\")\n>>> deko.get_mecab_bin()\n'C:\\\\mecab\\\\bin\\\\mecab.exe'\n\n# That is all; now you can use MeCab\n>>> deko.parse('\u96e8\u304c\u964d\u308b\u3002').words\n['\u96e8', '\u304c', '\u964d\u308b', '\u3002']\n```\n\n## Convenient IO APIs\n\n```python\n>>> from chirptext import chio\n>>> chio.write_tsv('data/test.tsv', [['a', 'b'], ['c', 'd']])\n>>> chio.read_tsv('data/test.tsv')\n[['a', 'b'], ['c', 'd']]\n\n>>> chio.write_file('data/content.tar.gz', 'Support writing to .tar.gz file')\n>>> chio.read_file('data/content.tar.gz')\n'Support writing to .tar.gz file'\n\n>>> for row in chio.read_tsv_iter('data/test.tsv'):\n...     print(row)\n... 
\n['a', 'b']\n['c', 'd']\n```\n\n## Web fetcher\n\n```python\nfrom chirptext import WebHelper\n\nweb = WebHelper('~/tmp/webcache.db')\ndata = web.fetch('https://letuananh.github.io/test/data.json')\ndata\n# b'{ \"name\": \"Kungfu Panda\" }\\n'\ndata_json = web.fetch_json('https://letuananh.github.io/test/data.json')\ndata_json\n# {'name': 'Kungfu Panda'}\n```\n\n## Using Counter\n\n```python\nfrom chirptext import Counter, TextReport\nfrom chirptext.leutile import LOREM_IPSUM\n\nct = Counter()\nvc = Counter()  # vowel counter\nfor char in LOREM_IPSUM:\n    if char == ' ':\n        continue\n    ct.count(char)\n    vc.count(\"Letters\")\n    if char in 'auieo':\n        vc.count(\"Vowels\")\n    else:\n        vc.count(\"Consonants\")\nvc.summarise()\nct.summarise(byfreq=True, limit=5)\n```\n\n### Output\n\n```\nLetters: 377 \nConsonants: 212 \nVowels: 165 \ni: 42 \ne: 37 \nt: 32 \no: 29 \na: 29 \n```\n\n## Sample TextReport\n\n```python\n# a string report\nrp = TextReport()  # by default, TextReport will write to standard output, i.e. terminal\nrp = TextReport(TextReport.STDOUT)  # same as above\nrp = TextReport('~/tmp/my-report.txt')  # output to a file\nrp = TextReport.null()  # output to /dev/null, i.e. nowhere\nrp = TextReport.string()  # output to a string. 
Call rp.content() to get the string\nrp = TextReport(TextReport.STRINGIO)  # same as above\n\n# TextReport will close the output stream automatically by using the with statement\nwith TextReport.string() as rp:\n    rp.header(\"Lorem Ipsum Analysis\", level=\"h0\")\n    rp.header(\"Raw\", level=\"h1\")\n    rp.print(LOREM_IPSUM)\n    rp.header(\"Top 5 most common letters\")\n    ct.summarise(report=rp, limit=5)\n    print(rp.content())\n```\n\n### Output\n```\n+---------------------------------------------------------------------------------- \n| Lorem Ipsum Analysis \n+---------------------------------------------------------------------------------- \n \nRaw \n------------------------------------------------------------ \nLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. 
\n \nTop 5 most common letters\n------------------------------------------------------------ \ni: 42 \ne: 37 \nt: 32 \no: 29 \na: 29 \n```", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/letuananh/chirptext", "keywords": "nlp", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "chirptext", "package_url": "https://pypi.org/project/chirptext/", "platform": "any", "project_url": "https://pypi.org/project/chirptext/", "project_urls": { "Bug Tracker": "https://github.com/letuananh/chirptext/issues", "Homepage": "https://github.com/letuananh/chirptext", "Source Code": "https://github.com/letuananh/chirptext/" }, "release_url": "https://pypi.org/project/chirptext/0.1a18/", "requires_dist": null, "requires_python": "", "summary": "ChirpText is a collection of text processing tools for Python.", "version": "0.1a18" }, "last_serial": 4076227, "releases": { "0.1a10": [ { "comment_text": "", "digests": { "md5": "5dac7d9a79410e1030fe3e1e78731f56", "sha256": "d98ab0670d937e0c8502a160498581a3537c193c6e02078dff45427f875560af" }, "downloads": -1, "filename": "chirptext-0.1a10.tar.gz", "has_sig": false, "md5_digest": "5dac7d9a79410e1030fe3e1e78731f56", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 56217, "upload_time": "2018-03-29T05:01:44", "url": "https://files.pythonhosted.org/packages/fb/87/9a52abdcbce5110733ab94389716f0fba379546d0125cc9320407b07b342/chirptext-0.1a10.tar.gz" } ], "0.1a11": [ { "comment_text": "", "digests": { "md5": "462143895c8fc00b7ba0b73c2ea3f878", "sha256": "4ed103951fb7acc783e2895bb3eaca3cdfbd43d10d9263b337f71947102369a2" }, "downloads": -1, "filename": "chirptext-0.1a11.tar.gz", "has_sig": false, "md5_digest": "462143895c8fc00b7ba0b73c2ea3f878", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 57762, "upload_time": 
"2018-04-02T13:23:14", "url": "https://files.pythonhosted.org/packages/34/30/0edbbdf958545f6cfd65c32397e6640474500c08e8c117f0b00103f99f52/chirptext-0.1a11.tar.gz" } ], "0.1a12": [ { "comment_text": "", "digests": { "md5": "4ea3baab69f91ada42485cd94c30c6ae", "sha256": "afdb45062e7e5193362d8f51f06f50e0c5614db1f44a4191983daebcaf9dfd4c" }, "downloads": -1, "filename": "chirptext-0.1a12.tar.gz", "has_sig": false, "md5_digest": "4ea3baab69f91ada42485cd94c30c6ae", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 58883, "upload_time": "2018-04-03T12:58:05", "url": "https://files.pythonhosted.org/packages/51/dd/46efe5e54538f9b5db30bd7045b012b27267de1b9b92dd3196cc386cb93b/chirptext-0.1a12.tar.gz" } ], "0.1a13": [ { "comment_text": "", "digests": { "md5": "6796674a0e3a884bd93bc2423ba1ae1e", "sha256": "6afd5a2e8f72cb8a607a288bf372ea2a551abb5c7a554d34fdb9248f23486fdd" }, "downloads": -1, "filename": "chirptext-0.1a13.tar.gz", "has_sig": false, "md5_digest": "6796674a0e3a884bd93bc2423ba1ae1e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 58711, "upload_time": "2018-04-04T13:52:38", "url": "https://files.pythonhosted.org/packages/af/11/d80b808ec6ac788187ed22d9016232011a4e7ab930d269833d5aa429feb2/chirptext-0.1a13.tar.gz" } ], "0.1a14": [ { "comment_text": "", "digests": { "md5": "148f1e17e31cdab86bd75a2b1bc9df99", "sha256": "87b36581ce1f05cd56055edefdb5fbbbb6802a9a08511b4c953aa30bb99f6743" }, "downloads": -1, "filename": "chirptext-0.1a14.tar.gz", "has_sig": false, "md5_digest": "148f1e17e31cdab86bd75a2b1bc9df99", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 61074, "upload_time": "2018-04-11T04:24:43", "url": "https://files.pythonhosted.org/packages/cb/a6/20f7529020c6cab5746a952da2b14a3be65cd74856742595322657574212/chirptext-0.1a14.tar.gz" } ], "0.1a15": [ { "comment_text": "", "digests": { "md5": "5c2b4a7711189162a532069111c07bb1", "sha256": 
"268e3f2027d14869ba05375a3d02abacdfb9e7778f146e954fe98996e78dbb75" }, "downloads": -1, "filename": "chirptext-0.1a15.tar.gz", "has_sig": false, "md5_digest": "5c2b4a7711189162a532069111c07bb1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 62856, "upload_time": "2018-04-11T07:41:20", "url": "https://files.pythonhosted.org/packages/4d/d8/9fd503c87741bf9ed38b2f629c11a7535982469f65c75867efdbda4ce09a/chirptext-0.1a15.tar.gz" } ], "0.1a16": [ { "comment_text": "", "digests": { "md5": "c1d2ca546ba3dfb7b1c2841e118f0ba0", "sha256": "83e67eff8dd8ded74c0f67c449d8d77e988586d851073a0c344f470d6fc7777f" }, "downloads": -1, "filename": "chirptext-0.1a16.tar.gz", "has_sig": false, "md5_digest": "c1d2ca546ba3dfb7b1c2841e118f0ba0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64574, "upload_time": "2018-04-16T16:27:15", "url": "https://files.pythonhosted.org/packages/fc/67/6ca8c0bb6cae5a168e9617dcc688f7af247e85077944defa4e6b4f47b31a/chirptext-0.1a16.tar.gz" } ], "0.1a17": [ { "comment_text": "", "digests": { "md5": "bf99f6f6022f151597d1bf5ce6205e4d", "sha256": "3e7b06e5326b766b216434518023f9feab519b805725e1c4fbf182f6c4567072" }, "downloads": -1, "filename": "chirptext-0.1a17.tar.gz", "has_sig": false, "md5_digest": "bf99f6f6022f151597d1bf5ce6205e4d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 66118, "upload_time": "2018-05-30T15:31:40", "url": "https://files.pythonhosted.org/packages/29/7b/26718421e97adcc87b890d8962c6ca339518fd86f0fba36fd3970b16445a/chirptext-0.1a17.tar.gz" } ], "0.1a18": [ { "comment_text": "", "digests": { "md5": "b02ddb49a2334dfc6974ce7951b41d29", "sha256": "2929017ac8a875860bb23a7d9fcfa4f3e7797c4fc894872ec1f24afcde947631" }, "downloads": -1, "filename": "chirptext-0.1a18.tar.gz", "has_sig": false, "md5_digest": "b02ddb49a2334dfc6974ce7951b41d29", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 65000, 
"upload_time": "2018-07-18T03:47:09", "url": "https://files.pythonhosted.org/packages/13/7d/59add42be3a6e3591f1c2a848a9034a302595ce82caecfab98321f2d2a2c/chirptext-0.1a18.tar.gz" } ], "0.1a2": [ { "comment_text": "", "digests": { "md5": "e3cb07dd3608980926e776412a3d03c5", "sha256": "ed7cf7832dac5b00be0fc4e6d21b348db4c4e1397e4e3fba30e289c5348acf8e" }, "downloads": -1, "filename": "chirptext-0.1a2.tar.gz", "has_sig": false, "md5_digest": "e3cb07dd3608980926e776412a3d03c5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 49331, "upload_time": "2018-01-24T08:49:20", "url": "https://files.pythonhosted.org/packages/e9/9e/80950a317662fdf6079ac5e63323f5c26a22cb82b02005dff5c64533878b/chirptext-0.1a2.tar.gz" } ], "0.1a3": [ { "comment_text": "", "digests": { "md5": "71631b40d63e03d88d364014f2cde68d", "sha256": "1066e00df3f239ccfeb4978ceb2078f5a11f9cfdf3f6cd62d7f6ca08bf86f53c" }, "downloads": -1, "filename": "chirptext-0.1a3.tar.gz", "has_sig": false, "md5_digest": "71631b40d63e03d88d364014f2cde68d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 49632, "upload_time": "2018-02-05T03:14:26", "url": "https://files.pythonhosted.org/packages/95/84/4111d9e88c33a796b2d4f5028b718bfd964b106ffd3bb584fc5c223e1742/chirptext-0.1a3.tar.gz" } ], "0.1a4": [ { "comment_text": "", "digests": { "md5": "0ba8f7af06c46c153b557acfd7070826", "sha256": "e5b10fbcfac988c40674600e9eec0a669e9cb889cdaa860a0b263c9dd2ff3336" }, "downloads": -1, "filename": "chirptext-0.1a4.tar.gz", "has_sig": false, "md5_digest": "0ba8f7af06c46c153b557acfd7070826", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52945, "upload_time": "2018-02-05T04:10:11", "url": "https://files.pythonhosted.org/packages/ff/1a/1d35015d5aa6801b5110e172ec9a5e14cdd0cc16de85d0310bced79938e4/chirptext-0.1a4.tar.gz" } ], "0.1a5": [ { "comment_text": "", "digests": { "md5": "39f2d597508842968d92a4ee4abbaa1a", "sha256": 
"fa8cec741012b740d29e092063a7eef3e8e112b8adfd66c6fc6ed2a914dd279f" }, "downloads": -1, "filename": "chirptext-0.1a5.tar.gz", "has_sig": false, "md5_digest": "39f2d597508842968d92a4ee4abbaa1a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 53743, "upload_time": "2018-02-07T02:38:56", "url": "https://files.pythonhosted.org/packages/ef/aa/edb799bd67e124fa9bf2bed640714517f500762335a0cdddca961de260d9/chirptext-0.1a5.tar.gz" } ], "0.1a6": [ { "comment_text": "", "digests": { "md5": "876b5628140aa3dd219ca796fa89e529", "sha256": "2bf0a0a0c65055b5fafe1d78d26282a1a44b2e344fd52cfc1e2e1503657528c8" }, "downloads": -1, "filename": "chirptext-0.1a6.tar.gz", "has_sig": false, "md5_digest": "876b5628140aa3dd219ca796fa89e529", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 53919, "upload_time": "2018-02-07T03:34:21", "url": "https://files.pythonhosted.org/packages/23/36/aefaa5a27936f2defe4d835762ac2c4363213dae2f6de48f910a7f1ce0d9/chirptext-0.1a6.tar.gz" } ], "0.1a7": [ { "comment_text": "", "digests": { "md5": "c84cf5b8830aba90220d3641845af361", "sha256": "a8d6789228a5c83bb97721da786e5a6e18ddee0fb52d9e6b1770151e416d730f" }, "downloads": -1, "filename": "chirptext-0.1a7.tar.gz", "has_sig": false, "md5_digest": "c84cf5b8830aba90220d3641845af361", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54035, "upload_time": "2018-02-22T04:24:03", "url": "https://files.pythonhosted.org/packages/33/24/2e9fb9f4490f62e17ef142f03ab3d50f1a6a8ffbd6f9b24895d4707530eb/chirptext-0.1a7.tar.gz" } ], "0.1a8": [ { "comment_text": "", "digests": { "md5": "def85d269118443748e8c98f13a30b33", "sha256": "31a4703d61a01b0a189698d89bf88c994caeb4fa543cf51eb6be3c737f9b7b6a" }, "downloads": -1, "filename": "chirptext-0.1a8.tar.gz", "has_sig": false, "md5_digest": "def85d269118443748e8c98f13a30b33", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54229, "upload_time": 
"2018-02-26T05:54:45", "url": "https://files.pythonhosted.org/packages/c5/10/d480d32bab04cf68c1714ef616a09bc90c33820c894484d33a60beeeacf4/chirptext-0.1a8.tar.gz" } ], "0.1a9": [ { "comment_text": "", "digests": { "md5": "5c078c79739aa3e567a0e10136bffa6c", "sha256": "78f8c3101d5e3b3e7f833e893ef298ab72b3d489cefe7324bedaeb350a5f750a" }, "downloads": -1, "filename": "chirptext-0.1a9.tar.gz", "has_sig": false, "md5_digest": "5c078c79739aa3e567a0e10136bffa6c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 55678, "upload_time": "2018-03-28T05:33:49", "url": "https://files.pythonhosted.org/packages/f7/f8/8947abb019367bf8f2dff9c279438e055b303886f9b5c3fb089fbcf86591/chirptext-0.1a9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b02ddb49a2334dfc6974ce7951b41d29", "sha256": "2929017ac8a875860bb23a7d9fcfa4f3e7797c4fc894872ec1f24afcde947631" }, "downloads": -1, "filename": "chirptext-0.1a18.tar.gz", "has_sig": false, "md5_digest": "b02ddb49a2334dfc6974ce7951b41d29", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 65000, "upload_time": "2018-07-18T03:47:09", "url": "https://files.pythonhosted.org/packages/13/7d/59add42be3a6e3591f1c2a848a9034a302595ce82caecfab98321f2d2a2c/chirptext-0.1a18.tar.gz" } ] }