{ "info": { "author": "Joshua Crowgey", "author_email": "jcrowgey@uw.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Programming Language :: Python :: 3.5" ], "description": "neolo\n=====\n\nText Analysis Software for Saulo Brand\u00e3o. Developed by Joshua Crowgey\nin summer 2014.\n\n```\nusage: neolo [-h] [--dicts DICT [DICT ...]] [--mltd] [--msttr] [--hdd]\n [--verbose] [--wordlen] [--wordtypes] [--hapax] [--punc-ratio]\n [--no-hyphen] [--no-apostrophe] [--sents [ABBREV]]\n [--stemming LANGUAGE]\n TEXT\n\nExtract lexical statistics from a text file.\n\npositional arguments:\n TEXT the text you want to investigate\n\noptional arguments:\n -h, --help show this help message and exit\n --dicts DICT [DICT ...]\n a list of reference texts to compute neologism count\n --mltd measure of lexical textual diversity\n --msttr mean segmental type-token ratio\n --hdd HD-D probabilistic TTR\n --verbose, -v increase the verbosty (can be repeated: -vvv)\n --wordlen, -w print the distribution of words by length\n --wordtypes, -t print the distribution of wordtypes (unigrams) by\n count\n --hapax, -x print the list of hapax legomena\n --punc-ratio, -p print the ratio of punctuation tokens out of total\n tokens\n --no-hyphen, -y remove the hyphen (-) from the list of punctuation\n symbols used in tokenization\n --no-apostrophe, -a remove the apostrophe (') from the list of punctuation\n symbols used in tokenization\n --sents [ABBREV], -s [ABBREV]\n print sentence length statistics, uses an (optional)\n abbreviations file containing stings which don't end\n sentences (eg: Mr.). One abbreviaion per line, include\n relevant punctuation. Note that items in the\n abbreviations file will also be protected during later\n tokenization.\n --stemming LANGUAGE, -m LANGUAGE\n stem words using NLTK prior to processing them\n```\n\nNeologism Count\n---------------\nThe name of this program reflects this original functionality. Neologism\ncount is computed by referencing known wordlists or dictionaries. Word types\nfound in the text under consideration which are not found in the reference \ndictionaries/wordlists are considered neologisms.\n\nTo show a simple example, suppose you have a text file called mary.txt \nwhich contains the following traditional poem:\n\n```\nMary had a little lamb,\nHer fleece was white as snow.\nEverywhere that mary went,\nthe lamb was sure to go.\n```\n\nSupposing you're using the debian distro of GNU/Linux, there is a list of \nEnglish words stored in /usr/share/dict/words that you can use as a \nreference. You can ask neolo to check mary.txt for neologisms using \nthe --dicts option. The --dicts option takes a list of one ore more filenames\nto use as references in calculating neologisms.\n\n```\nuser@computer:~/src/neolo$ ./neolo texts/mary.txt --dicts /usr/share/dict/words\nOpening texts/mary.txt with encoding: utf-8 \nTokenizing, downcasing, stemming text: texts/mary.txt ... done.\nCounting and sorting words in text: texts/mary.txt ...done.\nOpening /usr/share/dict/words with encoding: utf-8 \nTokenizing, downcasing, stemming dict files: ['/usr/share/dict/words'] ... done.\nCounting and sorting words in dictonaries: ['/usr/share/dict/words'] ...done.\nNeologism list:\n\nStatistics:\n-----------\nText size: 21 tokens in 18 types.\nNumber of hapax legomena: 15\nTTR (type-token ratio): 0.8571428571428571\nHTR (hapax-token ratio): 0.7142857142857143\nHTyR (hapax-type ratio): 0.8333333333333334\nNeologisms: 0 types not found in 1 dictionaries\nDictionaries contained 234937 tokens in 233615 types.\n```\n\nAs you can see, there are no words in mary.txt which aren't in the reference\nwordlist file, so neolo says \"Neolgisms: 0 types not found in 1 dictionaries\".\n\nHowever, if you edit mary.txt such that instead of fleece, the poem's second\nline says ``Her pleece was white as snow.'', now neolo prints a neologism list\nalong with its regular output.\n\n```\nuser@computer:~/src/neolo$ ./neolo texts/mary.txt --dicts /usr/share/dict/words\nOpening texts/mary.txt with encoding: utf-8 \nTokenizing, downcasing, stemming text: texts/mary.txt ... done.\nCounting and sorting words in text: texts/mary.txt ...done.\nOpening /usr/share/dict/words with encoding: utf-8 \nTokenizing, downcasing, stemming dict files: ['/usr/share/dict/words'] ... done.\nCounting and sorting words in dictonaries: ['/usr/share/dict/words'] ...done.\nNeologism list:\npleece\n\nStatistics:\n-----------\nText size: 21 tokens in 18 types.\nNumber of hapax legomena: 15\nTTR (type-token ratio): 0.8571428571428571\nHTR (hapax-token ratio): 0.7142857142857143\nHTyR (hapax-type ratio): 0.8333333333333334\nNeologisms: 1 types not found in 1 dictionaries\nDictionaries contained 234937 tokens in 233615 types.\n```\n\nMLTD\n----\n\nMSTTR\n-----\n\nHD-D\n----\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/jcrowgey/neolo", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "neolo", "package_url": "https://pypi.org/project/neolo/", "platform": "", "project_url": "https://pypi.org/project/neolo/", "project_urls": { "Homepage": "https://github.com/jcrowgey/neolo" }, "release_url": "https://pypi.org/project/neolo/0.1.2/", "requires_dist": null, "requires_python": ">=3.0", "summary": "Text Analysis Software", "version": "0.1.2" }, "last_serial": 5169000, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "98494d913345d39020f1ac314b9bc393", "sha256": "c1e52e21f02a3ff382eabba54d452ac5c103387c8f23c3e00377e2107a6310a8" }, "downloads": -1, "filename": "neolo-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "98494d913345d39020f1ac314b9bc393", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.0", "size": 9325, "upload_time": "2019-03-27T03:57:19", "url": "https://files.pythonhosted.org/packages/b5/55/ba642becc953eda915b70aa05d2b79b23fb6b790e2093c4edca9667eac08/neolo-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b9e42da31c9796f33cafce56360938c9", "sha256": "ce9d175c20f0a1de4f3fc3c0ebbab11594c313c66010ea3fe577adfbc71177f2" }, "downloads": -1, "filename": "neolo-0.1.0.tar.gz", "has_sig": false, "md5_digest": "b9e42da31c9796f33cafce56360938c9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.0", "size": 8028, "upload_time": "2019-03-27T03:57:21", "url": "https://files.pythonhosted.org/packages/e0/e6/188e524b8aa79145606185d86db3459905ac6cc1b261f818ffe7b675ab19/neolo-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "42fafa701b642ce048289c1d06cb3e86", "sha256": "2471e73f7b2f751599b8da771bcd35d96b2b09d9c704f3b582a5b16238546ead" }, "downloads": -1, "filename": "neolo-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "42fafa701b642ce048289c1d06cb3e86", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.0", "size": 9890, "upload_time": "2019-04-19T04:54:12", "url": "https://files.pythonhosted.org/packages/b8/67/58c556c41304734f3c7b52d1f8ca1f241932e3c02e17d11dd506dd422872/neolo-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c007a9a21e492be51fc5cda77765d447", "sha256": "124845e0ce3f0a81b1e003ffc8f3208b46ba64652c204627624deb7b90bfca43" }, "downloads": -1, "filename": "neolo-0.1.1.tar.gz", "has_sig": false, "md5_digest": "c007a9a21e492be51fc5cda77765d447", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.0", "size": 8436, "upload_time": "2019-04-19T04:54:14", "url": "https://files.pythonhosted.org/packages/a4/aa/a5303e9afa96439aa11c6ab0e9180c46054c5a5c5941532be4a403d607fc/neolo-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "e8fcdab6dfb199f2bfead7a4ffbc6519", "sha256": "df670c6629f40862554225dcd17916bfcd108a3a95492079a5d7aaba66f3709b" }, "downloads": -1, "filename": "neolo-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "e8fcdab6dfb199f2bfead7a4ffbc6519", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.0", "size": 10012, "upload_time": "2019-04-21T04:08:45", "url": "https://files.pythonhosted.org/packages/4a/99/9778f8d5d442fb1bb09177c74f149fb3a8589437ecbd16e663eba39c7411/neolo-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8ab7eacbb5a69eeaf2104368df9a768e", "sha256": "f630924470a39f8037cb3ac0a4123078c73c8e08fc1d19171f81a4c209bc57ec" }, "downloads": -1, "filename": "neolo-0.1.2.tar.gz", "has_sig": false, "md5_digest": "8ab7eacbb5a69eeaf2104368df9a768e", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.0", "size": 9098, "upload_time": "2019-04-21T04:08:47", "url": "https://files.pythonhosted.org/packages/73/78/a7ce957eb45fe01a0e28b0bd03774df60289ca9524ab620ef0c0a5e9c124/neolo-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e8fcdab6dfb199f2bfead7a4ffbc6519", "sha256": "df670c6629f40862554225dcd17916bfcd108a3a95492079a5d7aaba66f3709b" }, "downloads": -1, "filename": "neolo-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "e8fcdab6dfb199f2bfead7a4ffbc6519", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.0", "size": 10012, "upload_time": "2019-04-21T04:08:45", "url": "https://files.pythonhosted.org/packages/4a/99/9778f8d5d442fb1bb09177c74f149fb3a8589437ecbd16e663eba39c7411/neolo-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8ab7eacbb5a69eeaf2104368df9a768e", "sha256": "f630924470a39f8037cb3ac0a4123078c73c8e08fc1d19171f81a4c209bc57ec" }, "downloads": -1, "filename": "neolo-0.1.2.tar.gz", "has_sig": false, "md5_digest": "8ab7eacbb5a69eeaf2104368df9a768e", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.0", "size": 9098, "upload_time": "2019-04-21T04:08:47", "url": "https://files.pythonhosted.org/packages/73/78/a7ce957eb45fe01a0e28b0bd03774df60289ca9524ab620ef0c0a5e9c124/neolo-0.1.2.tar.gz" } ] }