{ "info": { "author": "Chuancong Gao", "author_email": "chuancong@gmail.com", "bugtrack_url": null, "classifiers": [ "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3" ], "description": "[![PyPI version](https://img.shields.io/pypi/v/TagStats.svg)](https://pypi.python.org/pypi/TagStats/)\n[![PyPI pyversions](https://img.shields.io/pypi/pyversions/TagStats.svg)](https://pypi.python.org/pypi/TagStats/)\n[![PyPI license](https://img.shields.io/pypi/l/TagStats.svg)](https://pypi.python.org/pypi/TagStats/)\n\nA concise yet efficient implementation for computing the statistics of each tag's set of key phrases over input lines in Python 3.\nOne of the major applications is for [sentiment analysis](https://en.wikipedia.org/wiki/Sentiment_analysis), where each tag is a sentiment with the respective key phrases describing the sentiment.\n\n# How it Works\n\nA [trie](https://en.wikipedia.org/wiki/Trie) structure is constructed to index all the key phrases. Then each line is matched towards the index to compute the respective statistics.\nThe time complexity is $O(m^2 \\cdot n)$, where $m$ is the maximum number of words in each line and $n$ is the number of lines.\n\n# Installation\n\nThis package is available on PyPI. Just use `pip3 install -U TagStats` to install it.\n\n# Usage\n\nYou can simply call `compute(content, tagDict)`, where `content` is a list of lines and `tagDict` is a dictionary with each tag name as key and the respective set of key phrases as value.\n\n``` python\nfrom tagstats import compute\n\nprint(compute(\n [\n \"a b c\",\n \"b c\",\n \"a c e\"\n ],\n {\n \"+\": [\"a b\", \"a c\"],\n \"-\": [\"b c\"]\n }\n))\n```\n\nThe output is a dictionary with each tag name as key and the respective totaled frequencies for lines as value.\n\n``` python\n{'+': [1, 0, 1], '-': [1, 1, 0]}\n```\n\nYou can change the default tokenizer, by specifying a compiled regex as separator to `tagstats.tagstats.tokenizer`. You can disable the tokenizer to allow matching over word boundaries, by specifying `None`.\n\nYou can pre-build the index by calling `index(tagDict)`, and later reuse it more than once as an optional parameter of `compute(content, tagDict, index)`. \n\n# Tip\n\nI strongly encourage using PyPy instead of CPython to run the script for best performance.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/chuanconggao/TagStats/tarball/0.1.2", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/chuanconggao/TagStats", "keywords": "sentiment-analysis,string-search", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "TagStats", "package_url": "https://pypi.org/project/TagStats/", "platform": "", "project_url": "https://pypi.org/project/TagStats/", "project_urls": { "Download": "https://github.com/chuanconggao/TagStats/tarball/0.1.2", "Homepage": "https://github.com/chuanconggao/TagStats" }, "release_url": "https://pypi.org/project/TagStats/0.1.2/", "requires_dist": null, "requires_python": ">= 3", "summary": "Statistics for each tag's set of key phrases", "version": "0.1.2" }, "last_serial": 3823038, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "7c123c9bbcd84d8a22ca011770bb9819", "sha256": "ff47677ab7e70c4895e4f17ceb47f3309c39f36a038a2002a8866d3b007ed4b7" }, "downloads": -1, "filename": "TagStats-0.1.tar.gz", "has_sig": false, "md5_digest": "7c123c9bbcd84d8a22ca011770bb9819", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2274, "upload_time": "2018-03-03T10:10:31", "url": "https://files.pythonhosted.org/packages/ed/f8/cd1e8e6629c04be06c8c0882fc0310fbd62da88ddb4749f0704339d498bb/TagStats-0.1.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "05e62077aad8d420a344ab99b5611ed2", "sha256": "c37c28f40a335571faf8e884c5e5f182a91a28fea22c075104b217103ccfcda1" }, "downloads": -1, "filename": "TagStats-0.1.1.tar.gz", "has_sig": false, "md5_digest": "05e62077aad8d420a344ab99b5611ed2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2244, "upload_time": "2018-04-07T00:23:04", "url": "https://files.pythonhosted.org/packages/04/98/08403f9129d6f567e6f677b54fa3e667527c44e512799503f4a229826bef/TagStats-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "dd4deeff4eec6dfcd36ba1d8323f7a15", "sha256": "40115d0e9bb89a2c6b70a2478300f399ce085aa62122171e52bc8e2b7bfb5f3d" }, "downloads": -1, "filename": "TagStats-0.1.2.tar.gz", "has_sig": false, "md5_digest": "dd4deeff4eec6dfcd36ba1d8323f7a15", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 2673, "upload_time": "2018-05-01T06:24:23", "url": "https://files.pythonhosted.org/packages/0a/30/a9fe524890f702b287d923c50fbf6d801297051869c3d8d929234f968825/TagStats-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "dd4deeff4eec6dfcd36ba1d8323f7a15", "sha256": "40115d0e9bb89a2c6b70a2478300f399ce085aa62122171e52bc8e2b7bfb5f3d" }, "downloads": -1, "filename": "TagStats-0.1.2.tar.gz", "has_sig": false, "md5_digest": "dd4deeff4eec6dfcd36ba1d8323f7a15", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 2673, "upload_time": "2018-05-01T06:24:23", "url": "https://files.pythonhosted.org/packages/0a/30/a9fe524890f702b287d923c50fbf6d801297051869c3d8d929234f968825/TagStats-0.1.2.tar.gz" } ] }