{ "info": { "author": "JBMountford", "author_email": "jbm112358@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Saniti\n\n**Sanitise lists of text documents quickly, easily and whilst maintaining your sanity**\n\nThe aim was to streamline processing lists of documents into the same outputs into simply specifying the list of texts and defining the sanitization pipeline.\n\n### Usage:\n\n**As a function-ish**\n\n```\nimport saniti\noriginal_text = [\"I like to moves it, move its\", \"I likeing to move it!\", \"the of\"]\ntext = saniti.saniti(original_text, [\"token\", \"destop\", \"depunct\", \"unempty\", \"stem\", \"out_corp_dict\"]) #sanitise the text while initalising the class\nprint(text.text)\n\n{'dictionary': , 'corpus': [[(0, 1), (1, 1), (2, 2)], [(0, 1), (1, 1), (2, 1)], []]}\n```\n\n**As a class**\n\n```\nimport saniti\nsani1 = saniti.saniti() # initialise the santising class\ntext = sani1.process(original_text, [\"token\", \"destop\", \"depunct\", \"unempty\", \"lemma\", \"out_tag_doc\"]) # sanitise the text\nprint(text)\n\n[TaggedDocument(words=['I', 'like', 'move', 'move'], tags=['I like move move']), TaggedDocument(words=['I', 'likeing', 'move'], tags=['I likeing move']), TaggedDocument(words=[], tags=[''])]\n```\n\n## Pipeline Components\n\n* \"token\" - tokenise texts\n* \"depunct\" - remove punctuation\n* \"unempty\" - remove empty words within documents\n* \"lemma\" - lemmatize text\n* \"destop\" - remove stopwords\n* \"stem\" - stem texts\n* \"out_tag_doc\" - turns the texts into gensim tagged documents for Doc2Vec\n* \"out_corp_dict\" - turns the texts into gensim corpus and dictionary", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ChamRoshi/Saniti", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "saniti", "package_url": "https://pypi.org/project/saniti/", "platform": "", "project_url": "https://pypi.org/project/saniti/", "project_urls": { "Homepage": "https://github.com/ChamRoshi/Saniti" }, "release_url": "https://pypi.org/project/saniti/0.1.51/", "requires_dist": null, "requires_python": "", "summary": "Sanitise text while keeping your sanity", "version": "0.1.51" }, "last_serial": 3976101, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c9f24d21c4bdd92b43d3fae87e665b19", "sha256": "7e3cf635fb4c7df87fb7fd687128411bab6081d39a99c1569c72e9180a6c3dc9" }, "downloads": -1, "filename": "saniti-0.0.1.tar.gz", "has_sig": false, "md5_digest": "c9f24d21c4bdd92b43d3fae87e665b19", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2317, "upload_time": "2018-06-18T05:52:30", "url": "https://files.pythonhosted.org/packages/a6/38/3ed9f0939d03ecfd7fc3eec79d0109b09e50206b30d1b53bf35e14dd76c4/saniti-0.0.1.tar.gz" } ], "0.0.11": [ { "comment_text": "", "digests": { "md5": "bca25c759fafabf8214ddc97046c10ee", "sha256": "1ac577e3478f5a29eb61d3b6d43cdaf4816a5f436c69c667f64815e4082a0bba" }, "downloads": -1, "filename": "saniti-0.0.11.tar.gz", "has_sig": false, "md5_digest": "bca25c759fafabf8214ddc97046c10ee", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2313, "upload_time": "2018-06-18T05:52:32", "url": "https://files.pythonhosted.org/packages/3d/eb/99207d7e7c98c49501ab9d0a6c3d42ca2bd0fcfce0fe948c325ec7cf0833/saniti-0.0.11.tar.gz" } ], "0.0.12": [ { "comment_text": "", "digests": { "md5": "5802dbbcbf70c76b4dc361e12413ae6d", "sha256": "4058d31c489715ee0bddbf6160e9597f08cf91a44d23963adb5c92001f6d9979" }, "downloads": -1, "filename": "saniti-0.0.12.tar.gz", "has_sig": false, "md5_digest": "5802dbbcbf70c76b4dc361e12413ae6d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2638, "upload_time": "2018-06-18T05:52:33", "url": "https://files.pythonhosted.org/packages/7d/e5/b584f427708cd062392b95a4f6949d1c0b21a3c860bf069de81c38e9f6c7/saniti-0.0.12.tar.gz" } ], "0.0.13": [ { "comment_text": "", "digests": { "md5": "96a439fd451d209502d4d46037cd5f86", "sha256": "1eb13afd17266a2605ee473edd1acbcb8c670d1aa0d7f37abea9dc13d81a0824" }, "downloads": -1, "filename": "saniti-0.0.13.tar.gz", "has_sig": false, "md5_digest": "96a439fd451d209502d4d46037cd5f86", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2634, "upload_time": "2018-06-18T05:52:34", "url": "https://files.pythonhosted.org/packages/eb/9c/59099504eccf48181ce3ca8cab8bedc0321a4d3b7c93aaaf2b072eec5cbd/saniti-0.0.13.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "584701634fd06c81b50dd502233fd441", "sha256": "d8744cbfa04847194ea03013266464e6723fbe202d33f87245d5dab5da0a8677" }, "downloads": -1, "filename": "saniti-0.1.0.tar.gz", "has_sig": false, "md5_digest": "584701634fd06c81b50dd502233fd441", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2634, "upload_time": "2018-05-31T20:28:15", "url": "https://files.pythonhosted.org/packages/72/5a/02a07ff37defe30796a5296155c001b552f373788590a69059761de4a2dc/saniti-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "7ee6d4301d269829b58aa341903cbfda", "sha256": "cfbc1b96dd13bf96beacda66e3be98d46e427b02cc4e55154c382b8538356d9b" }, "downloads": -1, "filename": "saniti-0.1.1.tar.gz", "has_sig": false, "md5_digest": "7ee6d4301d269829b58aa341903cbfda", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2629, "upload_time": "2018-05-31T20:30:47", "url": "https://files.pythonhosted.org/packages/8f/ec/25f4e577b883021ef5e1675e8f9b45c03e542921ed649f6e510516ca523c/saniti-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "eb548c2e7875a8e9e3617fc3ee004e56", "sha256": "46e7b638f09d7a3815e2e37f3ebb1c401d035f2de4b392cf1d13b5e790c7ce45" }, "downloads": -1, "filename": "saniti-0.1.2.tar.gz", "has_sig": false, "md5_digest": "eb548c2e7875a8e9e3617fc3ee004e56", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2720, "upload_time": "2018-06-01T12:34:40", "url": "https://files.pythonhosted.org/packages/19/5f/0fe00fb972f9e501af4ae7826838fb36a86b79bb273529a5813c47f2badb/saniti-0.1.2.tar.gz" } ], "0.1.21": [ { "comment_text": "", "digests": { "md5": "c4835a146fb2313e1ece522f8d46f119", "sha256": "d20090af50cd0586a9259967cfc7a762c6c811efc7da36a9aa51f58f17294d58" }, "downloads": -1, "filename": "saniti-0.1.21.tar.gz", "has_sig": false, "md5_digest": "c4835a146fb2313e1ece522f8d46f119", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2733, "upload_time": "2018-06-01T16:59:01", "url": "https://files.pythonhosted.org/packages/f5/cd/3bc3dd56d0c7df695d421545ee14dc6a889c291f1d8a0f32c3934b0e356e/saniti-0.1.21.tar.gz" } ], "0.1.22": [ { "comment_text": "", "digests": { "md5": "555d7ec1218b58ece83dc67664654816", "sha256": "0eb608bc5256ede73061083904c1b23b471ccb339008fd8cb79d6818ed686bac" }, "downloads": -1, "filename": "saniti-0.1.22.tar.gz", "has_sig": false, "md5_digest": "555d7ec1218b58ece83dc67664654816", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2747, "upload_time": "2018-06-01T17:07:53", "url": "https://files.pythonhosted.org/packages/26/a0/f6b43f17995937b56eead7848acaeb3a58b7ffe68e0a855391ecb21b0984/saniti-0.1.22.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "d6af3bccba7f48c0559eb7b8b49b8a32", "sha256": "46be2791e9241cdd24a90eb3748833daf27409b405bb3030170f4616bcba38fb" }, "downloads": -1, "filename": "saniti-0.1.3.tar.gz", "has_sig": false, "md5_digest": "d6af3bccba7f48c0559eb7b8b49b8a32", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2847, "upload_time": "2018-06-03T05:36:09", "url": "https://files.pythonhosted.org/packages/c5/26/92ce4a612396b2ce659e4cc6ef80a33c41acb6e04371dc5124d8fb2b3e47/saniti-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "876d426dbbeaf43958da1d0c5f114f22", "sha256": "f908d4706dc78e1a03e48e55043f82e0b43c5bf175b2079dcb72b97321009f86" }, "downloads": -1, "filename": "saniti-0.1.4.tar.gz", "has_sig": false, "md5_digest": "876d426dbbeaf43958da1d0c5f114f22", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3010, "upload_time": "2018-06-13T05:56:49", "url": "https://files.pythonhosted.org/packages/33/51/90d0cb8a0bbc99b9bb2a4165ff9046ebc1a31465bd6bd4b6d35ec5d67479/saniti-0.1.4.tar.gz" } ], "0.1.41": [ { "comment_text": "", "digests": { "md5": "c968bad8fdd5389845bd8143ccd484e5", "sha256": "a98808189256ad27bfb869e6d09f37ad97eae2230e4494b30b891c36569fced0" }, "downloads": -1, "filename": "saniti-0.1.41.tar.gz", "has_sig": false, "md5_digest": "c968bad8fdd5389845bd8143ccd484e5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3010, "upload_time": "2018-06-17T16:48:28", "url": "https://files.pythonhosted.org/packages/70/c6/67d6101671b4c5cfb27b71cdb24eb8f480c23f55f98af2092beb86a25129/saniti-0.1.41.tar.gz" } ], "0.1.42": [ { "comment_text": "", "digests": { "md5": "f00607096b7c14a571d4fdb99e32521c", "sha256": "e9ba75315230100e07eb561bda705d2919f251432f404264a57cb402b2ee9c9d" }, "downloads": -1, "filename": "saniti-0.1.42.tar.gz", "has_sig": false, "md5_digest": "f00607096b7c14a571d4fdb99e32521c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3009, "upload_time": "2018-06-17T17:01:59", "url": "https://files.pythonhosted.org/packages/b8/fc/71f6b5ce72ebe250f8409a6f7f6691b1c6da4c547020af6a0eae2cb38aa8/saniti-0.1.42.tar.gz" } ], "0.1.43": [ { "comment_text": "", "digests": { "md5": "cd344d8499bc7c960ddc8574e89b02b1", "sha256": "c2277439cc63693f85e306d406f64f08a45403ba35393412b09e2c8d5edaae33" }, "downloads": -1, "filename": "saniti-0.1.43.tar.gz", "has_sig": false, "md5_digest": "cd344d8499bc7c960ddc8574e89b02b1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3009, "upload_time": "2018-06-17T19:33:15", "url": "https://files.pythonhosted.org/packages/99/9f/3bfc4febd1309b0332b343b87190e7b6afbf018a84a6caf525b71b3457bd/saniti-0.1.43.tar.gz" } ], "0.1.44": [ { "comment_text": "", "digests": { "md5": "9956ae2fa96fa83364c30e3a38255a81", "sha256": "45734066ce0ca047dbeec0c650ab23a1a661139466e95639cd7f27664b0ba042" }, "downloads": -1, "filename": "saniti-0.1.44.tar.gz", "has_sig": false, "md5_digest": "9956ae2fa96fa83364c30e3a38255a81", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3141, "upload_time": "2018-06-18T05:37:27", "url": "https://files.pythonhosted.org/packages/57/27/81599eecd9e2a9339d3d14d0534ba522528f94c1d6e35445903210301d09/saniti-0.1.44.tar.gz" } ], "0.1.51": [ { "comment_text": "", "digests": { "md5": "5065011ce3d617152812be09513f661a", "sha256": "ae890077f4ede610cda3b63013199467e17d1d840047c1731d5692cd83e2f863" }, "downloads": -1, "filename": "saniti-0.1.51.tar.gz", "has_sig": false, "md5_digest": "5065011ce3d617152812be09513f661a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3135, "upload_time": "2018-06-19T05:47:38", "url": "https://files.pythonhosted.org/packages/5f/28/2b23e2d8a83dbe4a9260859988ddacbce2c6a553c89358e54b64b9722a25/saniti-0.1.51.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5065011ce3d617152812be09513f661a", "sha256": "ae890077f4ede610cda3b63013199467e17d1d840047c1731d5692cd83e2f863" }, "downloads": -1, "filename": "saniti-0.1.51.tar.gz", "has_sig": false, "md5_digest": "5065011ce3d617152812be09513f661a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3135, "upload_time": "2018-06-19T05:47:38", "url": "https://files.pythonhosted.org/packages/5f/28/2b23e2d8a83dbe4a9260859988ddacbce2c6a553c89358e54b64b9722a25/saniti-0.1.51.tar.gz" } ] }