{ "info": { "author": "Lu\u00eds Gomes", "author_email": "luismsgomes@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Topic :: Text Processing :: General" ], "description": "uctools\n=======\n\nTools for showing information about unicode characters (UTF-8) and \nperforming normalization.\n\nCopyright \u00ae 2018, Lu\u00eds Gomes .\n\nThe following command line tools are provided:\n\n ucinfo\n writes on stdout the name of each unicode character read from stdin\n\n ucenum\n enumerates on stdout all unicode characters of a chosen category\n\n ucnorm\n applies a standard unicode normalization (NFC, NFKC, NFD or NFKD)\n\nucinfo\n------\n\nThe ucinfo tool reads UTF-8 text from stdin and writes to stdout information\nabout each character, one per line.\nThe output has 5 tab-separated columns:\n\n 1. the character itself, if printable, or an escaped representation of it\n 2. the decimal codepoint of the character\n 3. the number of bytes that the character occupies\n 4. the Unicode category of the character\n 5. the Unicode name of the character\n\nucenum\n------\n\nThe ucenum tool takes a category abbreviation as argument and outputs a list\n of all characters belonging to that category. The categories are:\n\n L\n Letter\n Lu\n Letter, Uppercase\n Ll\n Letter, Lowercase\n Lt\n Letter, Titlecase\n Lm\n Letter, Modifier\n Lo\n Letter, Other\n M\n Mark\n Mn\n Mark, Nonspacing\n Mc\n Mark, Spacing Combining\n Me\n Mark, Enclosing\n N\n Number\n Nd\n Number, Decimal Digit\n Nl\n Number, Letter\n No\n Number, Other\n P\n Punctuation\n Pc\n Punctuation, Connector\n Pd\n Punctuation, Dash\n Ps\n Punctuation, Open\n Pe\n Punctuation, Close\n Pi\n Punctuation, Initial quote (may behave like Ps or Pe depending on usage)\n Pf\n Punctuation, Final quote (may behave like Ps or Pe depending on usage)\n Po\n Punctuation, Other\n S\n Symbol\n Sm\n Symbol, Math\n Sc\n Symbol, Currency\n Sk\n Symbol, Modifier\n So\n Symbol, Other\n Z\n Separator\n Zs\n Separator, Space\n Zl\n Separator, Line\n Zp\n Separator, Paragraph\n C\n Other\n Cc\n Other, Control\n Cf\n Other, Format\n Cs\n Other, Surrogate\n Co\n Other, Private Use\n Cn\n Other, Not Assigned\n\nucnorm\n------\n\nThis program reads UTF-8 text from stdin and writes it to \nstdout after applying the specified normalization algorithm.\n\nThe Unicode standard defines various normalization forms of a Unicode \nstring, based on the definition of canonical equivalence and \ncompatibility equivalence. In Unicode, several characters can be \nexpressed in various way. For example, the character U+00C7 (LATIN\nCAPITAL LETTER C WITH CEDILLA) can also be expressed as the sequence\nU+0043 (LATIN CAPITAL LETTER C) U+0327 (COMBINING CEDILLA).\n\nEven if two unicode strings look the same to a human reader, if one\nhas combining characters and the other doesn\u2019t, they may not compare\nequal.\n\nFor each character, there are two normal forms:\n\n- Normal form D (NFD) is also known as canonical decomposition, and\n translates each character into its decomposed form.\n\n- Normal form C (NFC) first applies a canonical decomposition, then \n composes pre-combined characters again.\n\nIn addition to these two forms, there are two additional normal forms\nbased on compatibility equivalence:\n\n- Normal form KD (NFKD) will apply the compatibility decomposition,\n i.e. replace all compatibility characters with their equivalents.\n\n- Normal form KC (NFKC) first applies the compatibility decomposition,\n followed by the canonical composition.\n\nCompatibility decomposition ensures that equivalent characters will\ncompare equal (i.e. have the same codepoints). In Unicode, certain\ncharacters are supported which normally would be unified with other\ncharacters. For example, U+2160 (ROMAN NUMERAL ONE) is really the\nsame thing as U+0049 (LATIN CAPITAL LETTER I). However, it is \nsupported in Unicode for compatibility with existing character sets\n(e.g. gb2312).\n\nThis program uses the normalization algorithms implemented in Python's\nstandard library. See:\nhttps://docs.python.org/3/library/unicodedata.html#unicodedata.normalize", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/luismsgomes/uctools", "keywords": "text unicode character", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "uctools", "package_url": "https://pypi.org/project/uctools/", "platform": "", "project_url": "https://pypi.org/project/uctools/", "project_urls": { "Homepage": "https://github.com/luismsgomes/uctools" }, "release_url": "https://pypi.org/project/uctools/1.3.0/", "requires_dist": null, "requires_python": ">=3", "summary": "Unicode tools.", "version": "1.3.0", "yanked": false, "yanked_reason": null }, "last_serial": 7697316, "releases": { "1.0.2": [ { "comment_text": "", "digests": { "md5": "0b5be980006a64e08a232ca6c46d0131", "sha256": "d24552c6c30173b04b343eca1b64fcd189784aa6164974fec655975bf82f3809" }, "downloads": -1, "filename": "uctools-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "0b5be980006a64e08a232ca6c46d0131", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3718, "upload_time": "2018-06-04T13:18:45", "upload_time_iso_8601": "2018-06-04T13:18:45.111327Z", "url": "https://files.pythonhosted.org/packages/25/a1/6be5c1e6418eba31411c733a8a076104e9349123bb05390caf213692c2c5/uctools-1.0.2-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "531660550c7c68a22f507c21a47fdced", "sha256": "200cd356d01f5f9257292e1c3d19921fb1c224a1a806590964504fe81c43275c" }, "downloads": -1, "filename": "uctools-1.0.2.tar.gz", "has_sig": false, "md5_digest": "531660550c7c68a22f507c21a47fdced", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3287, "upload_time": "2018-06-04T13:18:46", "upload_time_iso_8601": "2018-06-04T13:18:46.401985Z", "url": "https://files.pythonhosted.org/packages/5e/8c/23c2a9f39e256f66b16aa29bd07bd0c8fc2e0175ec8c09840cddc271b6b8/uctools-1.0.2.tar.gz", "yanked": false, "yanked_reason": null } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "d803b19b0680a450c6abdd027aa2471c", "sha256": "63e81430d4d0bf12236d514f8f6abdb9a40dde93ff65c91a69002fe6ea66e35f" }, "downloads": -1, "filename": "uctools-1.1.0.tar.gz", "has_sig": false, "md5_digest": "d803b19b0680a450c6abdd027aa2471c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 3409, "upload_time": "2019-10-24T10:42:02", "upload_time_iso_8601": "2019-10-24T10:42:02.227524Z", "url": "https://files.pythonhosted.org/packages/9f/e0/4b5d543eb16596c96742d602478b9fedf439ea4f08c3f3db73f15073f1d8/uctools-1.1.0.tar.gz", "yanked": false, "yanked_reason": null } ], "1.1.1": [ { "comment_text": "", "digests": { "md5": "ad6ade42b3ad4268ef55e07cfa2cd83e", "sha256": "8e09ef75d8faa069d6a404da5130caae4ba4e8a6ca834000b1939762c62437b8" }, "downloads": -1, "filename": "uctools-1.1.1.tar.gz", "has_sig": false, "md5_digest": "ad6ade42b3ad4268ef55e07cfa2cd83e", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 3408, "upload_time": "2019-10-24T10:47:45", "upload_time_iso_8601": "2019-10-24T10:47:45.436604Z", "url": "https://files.pythonhosted.org/packages/54/54/b675a4f787df3f21be51aa25e4ad67223fded3b9285b3b1a3fefe03d3de1/uctools-1.1.1.tar.gz", "yanked": false, "yanked_reason": null } ], "1.2.0": [ { "comment_text": "", "digests": { "md5": "f515ada66a8f0151a0c49d4351f65cdb", "sha256": "4695c0b44632ce92b06c39063bdd5f45c69ba4be5ffe7a32696903105573df53" }, "downloads": -1, "filename": "uctools-1.2.0.tar.gz", "has_sig": false, "md5_digest": "f515ada66a8f0151a0c49d4351f65cdb", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 3561, "upload_time": "2019-10-24T11:10:07", "upload_time_iso_8601": "2019-10-24T11:10:07.749045Z", "url": "https://files.pythonhosted.org/packages/ca/a1/d891ca76c9645777e803321b8203ba4b59715ad0df48fc3819ef10d486e7/uctools-1.2.0.tar.gz", "yanked": false, "yanked_reason": null } ], "1.2.1": [ { "comment_text": "", "digests": { "md5": "49fd464d1d2eaf29cdcc5b8e50c9e4a4", "sha256": "f89148a7fcb31719215fabba7f605185a41df62ccd456558662c8bc76f918ea2" }, "downloads": -1, "filename": "uctools-1.2.1.tar.gz", "has_sig": false, "md5_digest": "49fd464d1d2eaf29cdcc5b8e50c9e4a4", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 3569, "upload_time": "2019-10-24T11:21:23", "upload_time_iso_8601": "2019-10-24T11:21:23.416396Z", "url": "https://files.pythonhosted.org/packages/63/6e/15f479cb4d1168f07d875be369ffc08fa0f900419f71a379aeb2882a775d/uctools-1.2.1.tar.gz", "yanked": false, "yanked_reason": null } ], "1.3.0": [ { "comment_text": "", "digests": { "md5": "b2c12b534fc0234bdf5dfc50950955be", "sha256": "6bc0d947215af420534c0f62dcc0272f0b6066ed5545eec88e9a3c2313f6b826" }, "downloads": -1, "filename": "uctools-1.3.0.tar.gz", "has_sig": false, "md5_digest": "b2c12b534fc0234bdf5dfc50950955be", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 4638, "upload_time": "2020-07-14T11:26:59", "upload_time_iso_8601": "2020-07-14T11:26:59.758777Z", "url": "https://files.pythonhosted.org/packages/04/cb/70ed842d9a43460eedaa11f7503b4ab6537b43b63f0d854d59d8e150fac1/uctools-1.3.0.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b2c12b534fc0234bdf5dfc50950955be", "sha256": "6bc0d947215af420534c0f62dcc0272f0b6066ed5545eec88e9a3c2313f6b826" }, "downloads": -1, "filename": "uctools-1.3.0.tar.gz", "has_sig": false, "md5_digest": "b2c12b534fc0234bdf5dfc50950955be", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 4638, "upload_time": "2020-07-14T11:26:59", "upload_time_iso_8601": "2020-07-14T11:26:59.758777Z", "url": "https://files.pythonhosted.org/packages/04/cb/70ed842d9a43460eedaa11f7503b4ab6537b43b63f0d854d59d8e150fac1/uctools-1.3.0.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }