{ "info": { "author": "Iman Nazari", "author_email": "imannazari@hotmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "\n# persianutils\n\nA \\[getting] wonderful package to preprocess your Persian text for search, standardization, and NLP\n\n\n# Would it help?\n\nPersian shares many characters with Arabic, but the look-alike characters have different Unicode code points. As a result, the same word can be written in several ways that look almost identical. In addition, contextual forms of a character may appear in text; they don't change the shape of the word but cause the same problem. Unfortunately, many non-standard Persian keyboards don't follow these rules, which makes the problem worse.\nThis package helps standardize your Persian text so that it uses the original Persian characters.\n\n# How to use:\n\nTwo functions are implemented for standardizing Persian text: \"standardize\" and \"standardize4Word2vec\".\n\n```standardize()``` does the following:\n\n1. Replaces Arabic characters with their Persian equivalents. For example, ```from persianutils.ArabicAlphabet import ALEF_MAKSURA``` is replaced by ```from persianutils.PersianAlphabet import YE```\n\n2. Removes Tanveens such as \u0640\u064d, \u0640\u064e, etc.\n\n3. Replaces contextual forms of a character with its original form. For example, \"\u0640\u062a\u0640\u200e\" becomes \"\u062a\".\n\n4. Replaces Western and Eastern numerals with their Persian equivalents. For example, ```2``` becomes ```\u06f2```\n\nExample:\n\n```\n\nimport persianutils as pu\nraw_text = \"\u0633\u0644\u0627\u0645\u064c \u0639\u0644\u06cc\u06a9\u0645!\"\nprocessed_text = pu.standardize(raw_text)\nprint(processed_text)\n\n```\n\nThat would result in:\n\n```\n\n\u0633\u0644\u0627\u0645 \u0639\u0644\u06cc\u06a9\u0645!\n\n```\n\n\nstandardize4Word2vec() has these features:\n\n1. Same as the standardize() #1\n\n2. 
Same as the standardize() #2\n\n3. Same as the standardize() #3\n\n4. Replaces all numerals (Eastern, Western, and Persian) with their Persian word forms. For example, ```2``` becomes ```\u062f\u0648```\n\n5. Replaces every punctuation mark with a single space. The punctuation marks are: ```[!\"#%\\'()*+,-./:;<=>?@\\[\\]^_`{|}~\u2019\u201d\u201c\u2032\u2018\\\\\\]\u061f\u061b\u00ab\u00bb\u060c\u066a```\n\nExample:\n\n```\n\nimport persianutils as pu\nraw_text = \"\u0633\u0644\u0627\u0645\u064c \u0639\u0644\u06cc\u06a9\u0645!\"\nprocessed_text = pu.standardize4Word2vec(raw_text)\nprint(processed_text)\n\n```\n\nThis would result in:\n\n```\n\n\u0633\u0644\u0627\u0645 \u0639\u0644\u06cc\u06a9\u0645 \n\n```\n\nLists of Persian and Arabic characters are also available. Persian characters can be imported from ```persianutils.PersianAlphabet```:\n\n```\n\nfrom persianutils.PersianAlphabet import ALEF, BE, PE, TE\n\n```\n\nOr for Arabic:\n\n```\n\nfrom persianutils.ArabicAlphabet import ALEF_HAMZA_ABOVE_FINAL, HAMZA_ABOVE_ALEF\n\n```\n\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ishto7/persianutils", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "persianutils", "package_url": "https://pypi.org/project/persianutils/", "platform": "", "project_url": "https://pypi.org/project/persianutils/", "project_urls": { "Homepage": "https://github.com/ishto7/persianutils" }, "release_url": "https://pypi.org/project/persianutils/0.1.2/", "requires_dist": null, "requires_python": "", "summary": "A [getting] wonderfull package to preprocess your Persian text for Search, Standardizing & NLP processes", "version": "0.1.2" }, "last_serial": 4646128, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "84e59ad6d4477d628a2d49b448d4a10f", "sha256": "d9e87d8e0d389a5a09ba5c43bccee4a62e5e444f853c26ae632d31c14e2633fb" }, "downloads": -1, "filename": 
"persianutils-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "84e59ad6d4477d628a2d49b448d4a10f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6889, "upload_time": "2018-08-02T18:11:50", "url": "https://files.pythonhosted.org/packages/0a/bf/727ab388690a08c5346a789a4cea1e0996492fde7309c137ddfb90cfe7ca/persianutils-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f69d84d79037fd4c3ca2d775d4c3756b", "sha256": "b3c274edc8b78015038bd8449876b9a41ef7f03f88454b6148aea8c961b11536" }, "downloads": -1, "filename": "persianutils-0.0.2.tar.gz", "has_sig": false, "md5_digest": "f69d84d79037fd4c3ca2d775d4c3756b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5908, "upload_time": "2018-08-02T18:11:51", "url": "https://files.pythonhosted.org/packages/af/3d/a39a60ebb379edcd75161f1b10a9139a084df855198fc1d1005ff791e76c/persianutils-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "32088a11d0b3f8816f96f59d89dd894e", "sha256": "d5d04d52f2f616e61d68e8dac81ba92211a509e277ffbe4254733c049824d3b9" }, "downloads": -1, "filename": "persianutils-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "32088a11d0b3f8816f96f59d89dd894e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6895, "upload_time": "2018-08-02T18:22:22", "url": "https://files.pythonhosted.org/packages/de/fa/0382afaef5d145c2231444339fc6345d41919a97dc2e864eab2137ed7b70/persianutils-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "89d061104075016ecb6b0d1dca4a593f", "sha256": "d87bf5416ca841c655fd6a740a661bf23efd2e37445b114bbb83bfab605ef57a" }, "downloads": -1, "filename": "persianutils-0.0.3.tar.gz", "has_sig": false, "md5_digest": "89d061104075016ecb6b0d1dca4a593f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5912, "upload_time": "2018-08-02T18:22:23", "url": 
"https://files.pythonhosted.org/packages/ef/de/0e11a4d1619d848bb9ee61ede388bb0976acb3a6a627cbe1b2d5a182a7e6/persianutils-0.0.3.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "30eacd797323ddb146e1cefdc72cb616", "sha256": "cca38fdba6019a9ba53f94313c298a7430ab54226ebb2c26b51f7855a2adf2b5" }, "downloads": -1, "filename": "persianutils-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "30eacd797323ddb146e1cefdc72cb616", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8071, "upload_time": "2018-12-30T11:50:08", "url": "https://files.pythonhosted.org/packages/a6/ef/dbcfc0136aa2d21f946338dbf8293be25e2e2e0241f093ff537c0e0d887f/persianutils-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "96d0fa6bb3388f90b20eacb34117f19b", "sha256": "f399ba783d5bcbb4dc1d4df937a6b7e4a40ede5930627b4f6061f9e36a9c195b" }, "downloads": -1, "filename": "persianutils-0.1.0.tar.gz", "has_sig": false, "md5_digest": "96d0fa6bb3388f90b20eacb34117f19b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5992, "upload_time": "2018-12-30T11:50:10", "url": "https://files.pythonhosted.org/packages/02/60/cbf7b57c4133b0c93606f9eb38152999f04b54d501e9cd5f9c3c055a607a/persianutils-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "420a107a2654ebdefb8e83dc622ab0c1", "sha256": "ae6ab1b1bf443aaa99b3082094b6afb04532bf9b1cf1160077ea6415f68b8a24" }, "downloads": -1, "filename": "persianutils-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "420a107a2654ebdefb8e83dc622ab0c1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9013, "upload_time": "2018-12-30T12:32:27", "url": "https://files.pythonhosted.org/packages/6c/fe/d885747b4e72d4828b838e3141bf1d0a314bae3fa12cdfa5f389ec963680/persianutils-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5242616101beccbf01d97957ef65357c", "sha256": 
"9f37e1d7269090ab8622ee236cee92247992ab8a549423ab3e7d4f1ca4de62fb" }, "downloads": -1, "filename": "persianutils-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5242616101beccbf01d97957ef65357c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6844, "upload_time": "2018-12-30T12:32:29", "url": "https://files.pythonhosted.org/packages/ae/94/e47fe51c8821a5e31e10a25e7250400c10e1c7130de88cb6fffa3d888742/persianutils-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "6cebbffad23b8c28817dd99932776f1d", "sha256": "90d58c245bd03b691ef6b0a3061d426639eb05b3720b6a6b5fc6d41a48f22cda" }, "downloads": -1, "filename": "persianutils-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "6cebbffad23b8c28817dd99932776f1d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9016, "upload_time": "2018-12-30T12:45:47", "url": "https://files.pythonhosted.org/packages/80/9c/5c382b9f45afc205e6c26cc94ed55b17949f3fa3a827f53571a6796257a4/persianutils-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ba714b017ce768b631f521879a9d45ed", "sha256": "fd58297e6e10c8e650bc7fa3421a7630a86c0a50b37010bd662a5ec9e53d0502" }, "downloads": -1, "filename": "persianutils-0.1.2.tar.gz", "has_sig": false, "md5_digest": "ba714b017ce768b631f521879a9d45ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6852, "upload_time": "2018-12-30T12:45:51", "url": "https://files.pythonhosted.org/packages/69/2c/e11d82dfe3664e1ca16b90c192fdfd864f6c03e5738b4339572cbdf3b081/persianutils-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6cebbffad23b8c28817dd99932776f1d", "sha256": "90d58c245bd03b691ef6b0a3061d426639eb05b3720b6a6b5fc6d41a48f22cda" }, "downloads": -1, "filename": "persianutils-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "6cebbffad23b8c28817dd99932776f1d", "packagetype": "bdist_wheel", "python_version": "py3", 
"requires_python": null, "size": 9016, "upload_time": "2018-12-30T12:45:47", "url": "https://files.pythonhosted.org/packages/80/9c/5c382b9f45afc205e6c26cc94ed55b17949f3fa3a827f53571a6796257a4/persianutils-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ba714b017ce768b631f521879a9d45ed", "sha256": "fd58297e6e10c8e650bc7fa3421a7630a86c0a50b37010bd662a5ec9e53d0502" }, "downloads": -1, "filename": "persianutils-0.1.2.tar.gz", "has_sig": false, "md5_digest": "ba714b017ce768b631f521879a9d45ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6852, "upload_time": "2018-12-30T12:45:51", "url": "https://files.pythonhosted.org/packages/69/2c/e11d82dfe3664e1ca16b90c192fdfd864f6c03e5738b4339572cbdf3b081/persianutils-0.1.2.tar.gz" } ] }