{ "info": { "author": "Jonathan Raiman", "author_email": "jonathanraiman@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Science/Research", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.3", "Topic :: Text Processing :: Linguistic" ], "description": "Ciseau\n------\n\nWord and sentence tokenization in Python.\n\n[![PyPI version](https://badge.fury.io/py/ciseau.svg)](https://badge.fury.io/py/ciseau)\n[![Build Status](https://travis-ci.org/JonathanRaiman/ciseau.svg?branch=master)](https://travis-ci.org/JonathanRaiman/ciseau)\n![Jonathan Raiman, author](https://img.shields.io/badge/Author-Jonathan%20Raiman%20-blue.svg)\n\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE.md)\n\n\nUsage\n-----\n\nUse this package to split up strings according to sentence and word boundaries.\nFor instance, to simply break up strings into tokens:\n\n```\ntokenize(\"Joey was a great sailor.\")\n#=> [\"Joey \", \"was \", \"a \", \"great \", \"sailor \", \".\"]\n```\n\nTo also detect sentence boundaries:\n\n```\nsent_tokenize(\"Cat sat mat. Cat's named Cool.\", keep_whitespace=True)\n#=> [[\"Cat \", \"sat \", \"mat\", \". \"], [\"Cat \", \"'s \", \"named \", \"Cool\", \".\"]]\n```\n\n`sent_tokenize` can keep the whitespace as-is with the flags `keep_whitespace=True` and `normalize_ascii=False`.\n\nInstallation\n------------\n\n```\npip3 install ciseau\n```\n\nTesting\n-------\n\nRun `nose2`.\n\n\nIf you find this project useful for your work or research, here's how you can cite it:\n\n```latex\n@misc{RaimanCiseau2017,\n author = {Raiman, Jonathan},\n title = {Ciseau},\n year = {2017},\n publisher = {GitHub},\n journal = {GitHub repository},\n howpublished = {\\url{https://github.com/jonathanraiman/ciseau}},\n commit = {fe88b9d7f131b88bcdd2ff361df60b6d1cc64c04}\n}\n```", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/JonathanRaiman/ciseau", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/JonathanRaiman/ciseau", "keywords": "XML,tokenization,NLP", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "ciseau", "package_url": "https://pypi.org/project/ciseau/", "platform": "any", "project_url": "https://pypi.org/project/ciseau/", "project_urls": { "Download": "https://github.com/JonathanRaiman/ciseau", "Homepage": "https://github.com/JonathanRaiman/ciseau" }, "release_url": "https://pypi.org/project/ciseau/1.0.1/", "requires_dist": null, "requires_python": "", "summary": "Word and sentence tokenization.", "version": "1.0.1" }, "last_serial": 3480009, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "5c88842d7d36831f4c35597f6399e169", "sha256": "4df74a336ab0b7a650bba67ef560e2600d8f965f837b55ff464865978e159bd2" }, "downloads": -1, "filename": "ciseau-1.0.0.tar.gz", "has_sig": false, "md5_digest": "5c88842d7d36831f4c35597f6399e169", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10102, "upload_time": "2017-04-11T18:30:15", "url": "https://files.pythonhosted.org/packages/85/ce/ffa2a138fd3ec5e1d7ad1e42c3a7925a8fca61a4a44fd6286dc776da8ccc/ciseau-1.0.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "c4083802e6ffc1179e09640851871b83", "sha256": "a316b9131f48dda54ea41dae25fc4adead04a3050c52c1ce2c0936a94f78e3ad" }, "downloads": -1, "filename": "ciseau-1.0.1.tar.gz", "has_sig": false, "md5_digest": "c4083802e6ffc1179e09640851871b83", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10344, "upload_time": "2018-01-11T07:27:14", "url": "https://files.pythonhosted.org/packages/0b/be/2ba2d3a6dbffc69797471c7e691153c40949879f1719264a8f3b16271ff2/ciseau-1.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c4083802e6ffc1179e09640851871b83", "sha256": "a316b9131f48dda54ea41dae25fc4adead04a3050c52c1ce2c0936a94f78e3ad" }, "downloads": -1, "filename": "ciseau-1.0.1.tar.gz", "has_sig": false, "md5_digest": "c4083802e6ffc1179e09640851871b83", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10344, "upload_time": "2018-01-11T07:27:14", "url": "https://files.pythonhosted.org/packages/0b/be/2ba2d3a6dbffc69797471c7e691153c40949879f1719264a8f3b16271ff2/ciseau-1.0.1.tar.gz" } ] }