{ "info": { "author": "Masaaki Shibata", "author_email": "mshibata@emptypage.jp", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Topic :: Text Processing" ], "description": "Introduction\n============\n\nA pure Python module to determine Unicode text segmentations.\n\nYou can see the full documentation including the package reference on \nhttp://uniseg-python.readthedocs.org.\n\n\nFeatures\n========\n\nThis package provides:\n\n- Functions to get Unicode Character Database (UCD) properties concerned with \n text segmentations.\n- Functions to determine segmentation boundaries of Unicode strings.\n- Classes that help implement Unicode-aware text wrapping on both console \n (monospace) and graphical (monospace / proportional) font environments.\n\nSupported segmentations are:\n\n*code point*\n `Code point `_ is *\"any value \n in the Unicode codespace.\"* It is the basic unit for processing Unicode \n strings.\n*grapheme cluster*\n `Grapheme cluster `_ \n approximately represents *\"user-perceived character.\"* They may be made \n up of single or multiple Unicode code points. e.g. \"G\" + *acute-accent* is \n a *user-perceived character*.\n*word break*\n Word boundaries are a familiar segmentation in many common text operations, \n e.g. as the unit for text highlighting, cursor jumping etc. Note that *words* are \n not determinable only by spaces or punctuation in text in some languages. \n Languages such as Thai or Japanese require dictionaries to determine \n appropriate word boundaries. 
Though the package only provides a simple word \n breaking implementation which is based on the scripts and doesn't use any \n dictionaries, it also provides ways to customize its default behaviours.\n*sentence break*\n Sentence breaks are also common in text processing but they are more \n contextual and less formal. The sentence breaking implementation (which is \n specified in UAX: Unicode Standard Annex) in the package is simple and \n formal too. But it should still be useful in some cases.\n*line break*\n Implementing the line breaking algorithm is one of the key features of this \n package. The feature is important in many general text presentations in \n both CLI and GUI applications.\n\n\nRequirements\n============\n\n- Python 2.7 / 3.3 / 3.4\n\n\nDownload\n========\n\nSource / binary distributions (PyPI)\n https://pypi.python.org/pypi/uniseg\nAll sources and build tools etc. (Bitbucket)\n https://bitbucket.org/emptypage/uniseg-python\n\n\nInstall\n=======\n\nJust type::\n\n % pip install uniseg\n\nor download the archive and::\n\n % python setup.py install\n\n\nChanges\n=======\n\n0.7.1 (2015-05-02)\n - CHANGE: wrap.Wrapper.wrap(): returns the count of lines now.\n - Separate LICENSE from README.txt for packaging-related reasons in some \n environments.\n0.7.0 (2015-02-27)\n - CHANGE: Quit gathering all submodules' members into the top-level uniseg \n module.\n - CHANGE: Reform ``uniseg.wrap`` module and sample scripts.\n - Maintained uniseg.wrap module, and sample scripts work again.\n0.6.4 (2015-02-10)\n - Add ``uniseg-dbpath`` console command, which just prints the path of \n ``ucd.sqlite3``.\n - Include sample scripts under the package's subdirectory.\n0.6.3 (2015-01-25)\n - Python 3.4\n - Support modern setuptools, pip and wheel.\n0.6.2 (2013-06-09)\n - Python 3.3\n0.6.1 (2013-06-08)\n - Unicode 6.2.0\n\n\nReferences\n==========\n\n*UAX #14: Unicode Line Breaking Algorithm* (6.2.0)\n http://www.unicode.org/reports/tr14/tr14-30.html\n*UAX #29: Unicode Text 
Segmentation* (6.2.0)\n http://www.unicode.org/reports/tr29/tr29-21.html\n\n\nRelated / Similar Projects\n==========================\n\n`PyICU `_ - Python extension wrapping the ICU C++ API\n *PyICU* is a Python extension wrapping the International Components for \n Unicode library (ICU). It also provides text segmentation support, which \n is richer and faster than ours. PyICU is an \n extension library so it requires the ICU dynamic library (binary files) and \n a compiler to build the extension. Our package is written in pure Python; \n it runs slower but is more portable.\n`pytextseg `_ - Python module for text segmentation\n The *pytextseg* package focuses on a goal very similar to ours; it provides \n Unicode-aware text wrapping features. It defines and uses its \n own string class (not the built-in ``unicode`` / ``str`` classes) for the \n purpose. We use ordinary built-in ``unicode`` / ``str`` \n objects for text processing in our modules.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/emptypage/uniseg-python", "keywords": null, "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "uniseg", "package_url": "https://pypi.org/project/uniseg/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/uniseg/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://bitbucket.org/emptypage/uniseg-python" }, "release_url": "https://pypi.org/project/uniseg/0.7.1/", "requires_dist": null, "requires_python": null, "summary": "A pure Python library to determine Unicode text segmentations", "version": "0.7.1" }, "last_serial": 1535105, "releases": { "0.6.0": [ { "comment_text": "", "digests": { "md5": "2b3fec1e18ec6c2e6fdeb3b7f0a12269", "sha256": "0ace5cd55454c426e02100c78e8105dcdddd89ddc6f140c324f046445c70f4ce" }, "downloads": -1, "filename": "uniseg-0.6.0.zip", "has_sig": 
false, "md5_digest": "2b3fec1e18ec6c2e6fdeb3b7f0a12269", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1504574, "upload_time": "2013-06-08T08:15:12", "url": "https://files.pythonhosted.org/packages/a9/71/11200d075f49d9771fb27af84f0ae47bc915152f11bf38d9159fac74bce7/uniseg-0.6.0.zip" } ], "0.6.1": [ { "comment_text": "", "digests": { "md5": "6eadd6e1a035f2e48dfa3fe9711b6240", "sha256": "3cad72f5e312e5ab93138966594e3b1dc38564ae6135c701f4ebdb5583d5476e" }, "downloads": -1, "filename": "uniseg-0.6.1.zip", "has_sig": false, "md5_digest": "6eadd6e1a035f2e48dfa3fe9711b6240", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1504791, "upload_time": "2013-06-08T14:27:18", "url": "https://files.pythonhosted.org/packages/e5/0b/e8f6cf38bce93ec1dd57478eebdfcdd4298174ee12f954a93468d8860057/uniseg-0.6.1.zip" } ], "0.6.2": [ { "comment_text": "", "digests": { "md5": "8247b602f77e6aa99189e4c17c5b3ff8", "sha256": "e7b892d1c4e8cca858ac9086162056bf1d9ac8051d2e6cc01ec730e8c81750de" }, "downloads": -1, "filename": "uniseg-0.6.2.zip", "has_sig": false, "md5_digest": "8247b602f77e6aa99189e4c17c5b3ff8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1508156, "upload_time": "2013-06-09T14:07:16", "url": "https://files.pythonhosted.org/packages/5c/6a/1ea9b54846caf8f2b8f8a2c83e1e0b46abd77f83fe942361aff7086f7af9/uniseg-0.6.2.zip" } ], "0.6.3": [ { "comment_text": "", "digests": { "md5": "74d7add4960ff419ee63e5d618bc632f", "sha256": "c211c104a3c3d321e7d118b97a352e3e31ff960cceee6f21018feb46a52f7bed" }, "downloads": -1, "filename": "uniseg-0.6.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "74d7add4960ff419ee63e5d618bc632f", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 1499364, "upload_time": "2015-01-24T16:43:26", "url": 
"https://files.pythonhosted.org/packages/8e/55/487ab42e439bf09b8527915a0c7c3001a7aac474092d852d01ed79a98dfd/uniseg-0.6.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "81768e5763349dd42c77d1b04a4890a6", "sha256": "63ee31ab47a921688f358757ccfc229d29b4c69a15d0a161b2fb5fad967a21d7" }, "downloads": -1, "filename": "uniseg-0.6.3.zip", "has_sig": false, "md5_digest": "81768e5763349dd42c77d1b04a4890a6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1509696, "upload_time": "2015-01-24T16:43:07", "url": "https://files.pythonhosted.org/packages/11/4b/e76e71375700b0085a330d5bdf65e8998c2e2e76057aaa8724b786d33898/uniseg-0.6.3.zip" } ], "0.6.4": [ { "comment_text": "", "digests": { "md5": "ae58427b06c2218cd7ad74b4d85c1343", "sha256": "c32f647cd9a03b8aa69d6531539a717caa8441d21711c55368ccea271f337dcb" }, "downloads": -1, "filename": "uniseg-0.6.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "ae58427b06c2218cd7ad74b4d85c1343", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 1511634, "upload_time": "2015-02-10T20:24:37", "url": "https://files.pythonhosted.org/packages/7d/ea/83e7878761650db2ccba185ace385189544070d3114cd5a3dbc48dd8545b/uniseg-0.6.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4dc6467fc895256c68df2375effbe2a8", "sha256": "b87ac0dcb87c0da50e98a63800dc6359485a035bd1fb2313113c56e037376c19" }, "downloads": -1, "filename": "uniseg-0.6.4.zip", "has_sig": false, "md5_digest": "4dc6467fc895256c68df2375effbe2a8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1517950, "upload_time": "2015-02-10T20:24:19", "url": "https://files.pythonhosted.org/packages/20/d8/be076d6b379b9ac54a1e447b043c04579cc4b14a10837411609afab1dc98/uniseg-0.6.4.zip" } ], "0.7.0": [ { "comment_text": "", "digests": { "md5": "02ffb50a635b1d0115d4562afd12d048", "sha256": "43baeaf3c52d2ff1968ab11314a298bb50ee93870bbe661a3c4bfb790b46f6b5" }, 
"downloads": -1, "filename": "uniseg-0.7.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "02ffb50a635b1d0115d4562afd12d048", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 1511467, "upload_time": "2015-02-27T04:26:06", "url": "https://files.pythonhosted.org/packages/6e/91/39206b3319b0306fc4f06bbac0942c8f4a560b1a191654f9a92ab1dedf93/uniseg-0.7.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4ac0372cd0e468decb3a8bd39544c89f", "sha256": "31a369437ea297777794b448cb4dcb9592093cbf50c03f4591b72e0fd434310d" }, "downloads": -1, "filename": "uniseg-0.7.0.zip", "has_sig": false, "md5_digest": "4ac0372cd0e468decb3a8bd39544c89f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1518080, "upload_time": "2015-02-27T04:25:48", "url": "https://files.pythonhosted.org/packages/b8/32/a16b01757b1247fdcff747e516023ac3d9f9442b91966070627692eba180/uniseg-0.7.0.zip" } ], "0.7.1": [ { "comment_text": "", "digests": { "md5": "313676bd25ac8b4678f809b6f9db2f84", "sha256": "5060897a089f290e0007067c37b03a8aae2e4a12c2081b249901db74ce8abc04" }, "downloads": -1, "filename": "uniseg-0.7.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "313676bd25ac8b4678f809b6f9db2f84", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 1510309, "upload_time": "2015-05-06T06:41:40", "url": "https://files.pythonhosted.org/packages/82/7d/0b91abc510259b2a4bea28c90db1f9d63cbb731e2d5a4e14f3b4ebd3d2d2/uniseg-0.7.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1dd59c7a1f567dc31b2ccd6a666a0548", "sha256": "e2c76af3ab0f8bfbb66feff91bfa87c1e2d97c8137873da8817559f3f7575a16" }, "downloads": -1, "filename": "uniseg-0.7.1.zip", "has_sig": false, "md5_digest": "1dd59c7a1f567dc31b2ccd6a666a0548", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1517035, "upload_time": "2015-05-06T06:41:21", "url": 
"https://files.pythonhosted.org/packages/5d/31/c70a45f83f10af81666de047caa3219ff221024a9d3ff5e1dd9511750fcb/uniseg-0.7.1.zip" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "313676bd25ac8b4678f809b6f9db2f84", "sha256": "5060897a089f290e0007067c37b03a8aae2e4a12c2081b249901db74ce8abc04" }, "downloads": -1, "filename": "uniseg-0.7.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "313676bd25ac8b4678f809b6f9db2f84", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 1510309, "upload_time": "2015-05-06T06:41:40", "url": "https://files.pythonhosted.org/packages/82/7d/0b91abc510259b2a4bea28c90db1f9d63cbb731e2d5a4e14f3b4ebd3d2d2/uniseg-0.7.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1dd59c7a1f567dc31b2ccd6a666a0548", "sha256": "e2c76af3ab0f8bfbb66feff91bfa87c1e2d97c8137873da8817559f3f7575a16" }, "downloads": -1, "filename": "uniseg-0.7.1.zip", "has_sig": false, "md5_digest": "1dd59c7a1f567dc31b2ccd6a666a0548", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1517035, "upload_time": "2015-05-06T06:41:21", "url": "https://files.pythonhosted.org/packages/5d/31/c70a45f83f10af81666de047caa3219ff221024a9d3ff5e1dd9511750fcb/uniseg-0.7.1.zip" } ] }