{ "info": { "author": "Said \u00d6zcan", "author_email": "saidozcn@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Environment :: Web Environment", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: MacOS :: MacOS X", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "Preprocessor\n============\n\n.. image:: https://travis-ci.org/s/preprocessor.svg?branch=master\n\nPreprocessor is a preprocessing library for tweet data written in Python.\n\nWhen building machine learning systems based on tweet data, a preprocessing step is required. This library makes it easy to clean, parse, or tokenize tweets.\n\nFeatures\n========\nCurrently supports cleaning, tokenizing and parsing:\n\n- URLs\n- Hashtags\n- Mentions\n- Reserved words (RT, FAV)\n- Emojis\n- Smileys\n\nSupports Python 2.7 and 3.3+\n\nUsage\n=====\n\nBasic cleaning:\n^^^^^^^^^^^^^^^\n\n.. code-block:: python\n\n >>> import preprocessor as p\n >>> p.clean('Preprocessor is #awesome \ud83d\udc4d https://github.com/s/preprocessor')\n 'Preprocessor is'\n\nTokenizing:\n^^^^^^^^^^^\n\n.. code-block:: python\n\n >>> p.tokenize('Preprocessor is #awesome \ud83d\udc4d https://github.com/s/preprocessor')\n 'Preprocessor is $HASHTAG$ $EMOJI$ $URL$'\n\nParsing:\n^^^^^^^^\n\n.. code-block:: python\n\n >>> parsed_tweet = p.parse('Preprocessor is #awesome https://github.com/s/preprocessor')\n \n >>> parsed_tweet.urls\n [(25:58) => https://github.com/s/preprocessor]\n >>> parsed_tweet.urls[0].start_index\n 25\n >>> parsed_tweet.urls[0].match\n 'https://github.com/s/preprocessor'\n >>> parsed_tweet.urls[0].end_index\n 58\n\nFully customizable:\n^^^^^^^^^^^^^^^^^^^\n\n.. 
code-block:: python\n\n >>> p.set_options(p.OPT.URL, p.OPT.EMOJI)\n >>> p.clean('Preprocessor is #awesome \ud83d\udc4d https://github.com/s/preprocessor')\n 'Preprocessor is #awesome'\n\nBy default, Preprocessor applies every option; call :code:`p.set_options` to restrict it to the ones you specify.\n\nAvailable Options:\n^^^^^^^^^^^^^^^^^^\n\n============== ======================\nOption Name    Option Short Code\n============== ======================\nURL            :code:`p.OPT.URL`\nMention        :code:`p.OPT.MENTION`\nHashtag        :code:`p.OPT.HASHTAG`\nReserved Words :code:`p.OPT.RESERVED`\nEmoji          :code:`p.OPT.EMOJI`\nSmiley         :code:`p.OPT.SMILEY`\nNumber         :code:`p.OPT.NUMBER`\n============== ======================\n\n\nInstallation\n============\nUsing pip:\n\n.. code-block:: bash\n\n $ pip install tweet-preprocessor", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/s/preprocessor", "keywords": "machine learning,preprocessing,tweet", "license": "UNKNOWN", "maintainer": null, "maintainer_email": null, "name": "tweet-preprocessor", "package_url": "https://pypi.org/project/tweet-preprocessor/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/tweet-preprocessor/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/s/preprocessor" }, "release_url": "https://pypi.org/project/tweet-preprocessor/0.5.0/", "requires_dist": null, "requires_python": null, "summary": "Elegant tweet preprocessing", "version": "0.5.0" }, "last_serial": 1935944, "releases": { "0.1.2": [ { "comment_text": "", "digests": { "md5": "470ff9a4954b33a0168570be912eb0d4", "sha256": "e6b021a40dbb31dd146436085c448b63b9e22f5a9363c7017ea98b2b1b50b8f2" }, "downloads": -1, "filename": "tweet-preprocessor-0.1.2.tar.gz", "has_sig": false, "md5_digest": "470ff9a4954b33a0168570be912eb0d4", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 2824, "upload_time": "2016-01-24T09:49:21", "url": "https://files.pythonhosted.org/packages/83/06/5e2bb536ad1a4a7c811009d440d38f1c712e0551e738f4ca929bba284791/tweet-preprocessor-0.1.2.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "81f427391d6b58d4b949798af7252986", "sha256": "931e4dbddc3fd9aee48ca20c93f478022a957616b370c9365b48391ad6cca226" }, "downloads": -1, "filename": "tweet-preprocessor-0.2.0.tar.gz", "has_sig": false, "md5_digest": "81f427391d6b58d4b949798af7252986", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4183, "upload_time": "2016-01-26T10:28:52", "url": "https://files.pythonhosted.org/packages/ad/5a/e05af242ad3470a90bf0f5e0beeab10ecbe966aee47c6ae68bb7e0189989/tweet-preprocessor-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "868d40fdb120a4bb77f70a5be2c266a2", "sha256": "a1d0a7204ab8f7664cc59567fc48ee8d07602b66957acc49fae0a598da3efc28" }, "downloads": -1, "filename": "tweet-preprocessor-0.3.0.tar.gz", "has_sig": false, "md5_digest": "868d40fdb120a4bb77f70a5be2c266a2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4819, "upload_time": "2016-01-27T14:54:45", "url": "https://files.pythonhosted.org/packages/d8/5e/8b6025ef4303f005bf887587d24933a1e91552d4a64e8d1438225630ec03/tweet-preprocessor-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "2086fff0480ebac301ce71c464d88f8f", "sha256": "9def7dcc11c212d230b9dfad3607c23f1f7bcef9b684309bd7bca86ce2784e2c" }, "downloads": -1, "filename": "tweet-preprocessor-0.4.0.tar.gz", "has_sig": false, "md5_digest": "2086fff0480ebac301ce71c464d88f8f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5415, "upload_time": "2016-01-31T00:17:08", "url": "https://files.pythonhosted.org/packages/1a/cb/3b0c0b44219eb967d14380c7834f160738221b1048bcb74304d023da9b54/tweet-preprocessor-0.4.0.tar.gz" } ], "0.5.0": [ { 
"comment_text": "", "digests": { "md5": "6de570130c7146abc327cefe7f3eddb6", "sha256": "994b6ff025d01a6656d2ec9ab55ba93a706147fd8bda639cde5812c126468314" }, "downloads": -1, "filename": "tweet-preprocessor-0.5.0.tar.gz", "has_sig": false, "md5_digest": "6de570130c7146abc327cefe7f3eddb6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6293, "upload_time": "2016-02-02T16:40:24", "url": "https://files.pythonhosted.org/packages/2a/f8/810ec35c31cca89bc4f1a02c14b042b9ec6c19dd21f7ef1876874ef069a6/tweet-preprocessor-0.5.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6de570130c7146abc327cefe7f3eddb6", "sha256": "994b6ff025d01a6656d2ec9ab55ba93a706147fd8bda639cde5812c126468314" }, "downloads": -1, "filename": "tweet-preprocessor-0.5.0.tar.gz", "has_sig": false, "md5_digest": "6de570130c7146abc327cefe7f3eddb6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6293, "upload_time": "2016-02-02T16:40:24", "url": "https://files.pythonhosted.org/packages/2a/f8/810ec35c31cca89bc4f1a02c14b042b9ec6c19dd21f7ef1876874ef069a6/tweet-preprocessor-0.5.0.tar.gz" } ] }