{ "info": { "author": "Lucas Shen YS", "author_email": "shen1ys@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6" ], "description": "===============\nLexicalRichness\n===============\n\n\n.. image:: https://img.shields.io/pypi/v/lexicalrichness.svg\n :target: https://pypi.python.org/pypi/lexicalrichness\n\n.. image:: https://readthedocs.org/projects/lexicalrichness/badge/?version=latest\n :target: https://lexicalrichness.readthedocs.io/en/latest/?badge=latest\n :alt: Documentation Status\n\n\nA small python module to compute textual lexical richness measures\n\nInstallation\n------------\n\t\n.. code-block:: bash\n\n\t$ pip install lexicalrichness\n\nQuickstart\n----------\n\n.. code-block:: python\n\n\t>>> from lexicalrichness import LexicalRichness\n\t\n\t# Generate object of readability statistics.\n\t>>> text = \"\"\"Measure of textual lexical diversity, computed as the mean length of sequential words in\n \t\ta text that maintains a minimum threshold TTR score.\n\t\t\n \t\tIterates over words until TTR scores falls below a threshold, then increase factor\n \t\tcounter by 1 and start over. McCarthy and Jarvis (2010, pg. 
385) recommends a factor\n \t\tthreshold in the range of [0.660, 0.750].\n \t\t(McCarthy 2005, McCarthy and Jarvis 2010)\"\"\"\n\t\n\t# instantiate new text object (use use_TextBlob=True argument to use the textblob tokenizer)\n\t>>> lex = LexicalRichness(text)\n\t\n\t# Return word count.\n\t>>> lex.words\n\t57\n\t\n\t# Return (unique) term count.\n\t>>> lex.terms\n\t39\n\t\n\t# Return type-token ratio (TTR) of text.\n\t>>> lex.ttr\n\t0.6842105263157895\n\t\n\t# Return root type-token ratio (RTTR) of text.\n\t>>> lex.rttr\n\t5.165676192553671\n\t\n\t# Return corrected type-token ratio (CTTR) of text.\n\t>>> lex.cttr\n\t3.6526846651686067\n\n\t# Return mean segmental type-token ratio (MSTTR).\n\t>>> lex.msttr(segment_window=25)\n\t0.88\n\t\n\t# Return moving average type-token ratio (MATTR).\n\t>>> lex.mattr(window_size=25)\n\t0.8351515151515151\n\t\n\t# Return Measure of Textual Lexical Diversity (MTLD).\n\t>>> lex.mtld(threshold=0.72)\n\t46.79226361031519\n\t\n\t# Return hypergeometric distribution diversity (HD-D) measure.\n\t>>> lex.hdd(draws=42)\n\t0.7468703323966486\n..\t\n\t# Return Herdan's lexical diversity measure.\n\t>>> lex.Herdan\n\t0.9061378160786574\n\t\n\t# Return Summer's lexical diversity measure.\n\t>>> lex.Summer\n\t0.9294460323356605\n\t\n\t# Return Dugast's lexical diversity measure.\n\t>>> lex.Dugast\n\t43.074336212149774\n\t\n\t# Return Maas's lexical diversity measure.\n\t>>> lex.Maas\n\t0.023215679867353005\n\nAttributes and properties\n+++++++++++++++++++++++++\n\n+-------------------------+-----------------------------------------------------------------------------------+ \n| ``wordlist`` | list of words \t\t | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``words`` \t\t | number of words (w) \t\t\t\t \t\t\t | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``terms``\t\t | number of unique terms 
(t)\t\t\t | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``tokenizer`` | tokenizer used\t\t | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``ttr``\t\t | type-token ratio computed as t / w (Chotlos 1944, Templin 1957) \t |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``rttr``\t | root TTR computed as t / sqrt(w) (Guiraud 1954, 1960) | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``cttr``\t | corrected TTR computed as t / sqrt(2w) (Carroll 1964)\t\t |\t \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Herdan`` \t | log(t) / log(w) (Herdan 1960, 1964) | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Summer`` \t | log(log(t)) / log(log(w)) Summer (1966) | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Dugast`` \t | (log(w) ** 2) / (log(w) - log(t)) Dugast (1978)\t\t\t\t | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Maas`` \t | (log(w) - log(t)) / (log(w) ** 2) Maas (1972) | \n+-------------------------+-----------------------------------------------------------------------------------+\n\nMethods\n+++++++\n\n+-------------------------+-----------------------------------------------------------------------------------+ \n| ``msttr`` \t | Mean segmental TTR (Johnson 1944)\t\t\t\t\t\t | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``mattr`` \t\t | Moving average TTR (Covington 2007, Covington and McFall 2010)\t\t | 
\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``mtld``\t\t | Measure of Textual Lexical Diversity (McCarthy 2005, McCarthy and Jarvis 2010) | \n+-------------------------+-----------------------------------------------------------------------------------+\n| ``hdd`` | HD-D (McCarthy and Jarvis 2007) | \n+-------------------------+-----------------------------------------------------------------------------------+\n\nAccessing method docstrings\n---------------------------\n.. code-block:: python\n\n\t>>> import inspect\n\t\n\t# docstring for hdd (HD-D)\n\t>>> print(inspect.getdoc(LexicalRichness.hdd))\n\t\n\tHypergeometric distribution diversity (HD-D) score.\n\n\tFor each term (t) in the text, compute the probability (p) of getting at least one appearance\n\tof t with a random draw of size n < N (text size). The contribution of t to the final HD-D\n\tscore is p * (1/n). The final HD-D score thus sums over p * (1/n) with p computed for\n\teach term t. Described in McCarthy and Jarvis 2007, pp. 
465-466.\n\t(McCarthy and Jarvis 2007)\n\n\tParameters\n\t__________\n\tdraws: int\n\t Number of random draws in the hypergeometric distribution (default=42).\n\n\tReturns\n\t_______\n\tfloat\n\n\n\n=======\nHistory\n=======\n\n0.1.2 (2018-05-09)\n------------------\n\n* First release on PyPI.\n\n0.1.3 (2018-05-27)\n------------------\n\n* Minor fix for compatibility issue with hyphens (ascii) in python 2.\n", "description_content_type": "", "docs_url": null, "download_url": "https://github.com/LSYS/lexicalrichness/archive/0.1.3.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/LSYS/lexicalrichness", "keywords": "lexical diversity", "license": "MIT license", "maintainer": "", "maintainer_email": "", "name": "lexicalrichness", "package_url": "https://pypi.org/project/lexicalrichness/", "platform": "", "project_url": "https://pypi.org/project/lexicalrichness/", "project_urls": { "Download": "https://github.com/LSYS/lexicalrichness/archive/0.1.3.tar.gz", "Homepage": "https://github.com/LSYS/lexicalrichness" }, "release_url": "https://pypi.org/project/lexicalrichness/0.1.3/", "requires_dist": null, "requires_python": "", "summary": "A small module to compute textual lexical richness", "version": "0.1.3" }, "last_serial": 3903639, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "fd4aef0b16a97ae6aca9091c0f8491ae", "sha256": "6a31c65ee5b5fa2f7a115677fbd0ab9a4f22e99f95748be476f806028c083407" }, "downloads": -1, "filename": "lexicalrichness-0.1.0.zip", "has_sig": false, "md5_digest": "fd4aef0b16a97ae6aca9091c0f8491ae", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23187, "upload_time": "2018-05-09T06:09:45", "url": "https://files.pythonhosted.org/packages/ae/c5/e855e3250bc155b79bd30fb5b500833c78d26d186442c63651b54d4366c1/lexicalrichness-0.1.0.zip" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "1b7fb4353482c49e6088f5d578cb92c9", "sha256": 
"a5536d40985ad79f9d891acb6266ed2d3dfd838fba92bf706faf3f8d2293435f" }, "downloads": -1, "filename": "lexicalrichness-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1b7fb4353482c49e6088f5d578cb92c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11459, "upload_time": "2018-05-09T06:34:09", "url": "https://files.pythonhosted.org/packages/5c/f2/57554457e97666a9bef85e762688f368c6654a1921afa5cc118ae5a470b4/lexicalrichness-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "625cc6a916560e89513d5a03c247d4d1", "sha256": "5e4f0dbe62eb88b6e4e3d33e71c986d6a05641d69bf79cba9d08a0fa13368815" }, "downloads": -1, "filename": "lexicalrichness-0.1.2.tar.gz", "has_sig": false, "md5_digest": "625cc6a916560e89513d5a03c247d4d1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15404, "upload_time": "2018-05-09T07:14:35", "url": "https://files.pythonhosted.org/packages/6d/6e/c7c7a1eb1e8ef0a9e084bc42500faf6951c63bcc33dd6bb9652d9cedaf58/lexicalrichness-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "f29747daac03d480634a983562586c81", "sha256": "6cd01115dca19b360ca765640810402d992b3e1262b8739f7976b6a2e01f74a2" }, "downloads": -1, "filename": "lexicalrichness-0.1.3.tar.gz", "has_sig": false, "md5_digest": "f29747daac03d480634a983562586c81", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15561, "upload_time": "2018-05-27T16:00:01", "url": "https://files.pythonhosted.org/packages/1f/3a/0e07cec04fea93aa215187a633d7e87b8cba1a7973aec6aedbb9cd269b8c/lexicalrichness-0.1.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f29747daac03d480634a983562586c81", "sha256": "6cd01115dca19b360ca765640810402d992b3e1262b8739f7976b6a2e01f74a2" }, "downloads": -1, "filename": "lexicalrichness-0.1.3.tar.gz", "has_sig": false, "md5_digest": "f29747daac03d480634a983562586c81", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 15561, "upload_time": "2018-05-27T16:00:01", "url": "https://files.pythonhosted.org/packages/1f/3a/0e07cec04fea93aa215187a633d7e87b8cba1a7973aec6aedbb9cd269b8c/lexicalrichness-0.1.3.tar.gz" } ] }