{ "info": { "author": "Allison Parrish", "author_email": "allison@decontextualize.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Topic :: Artistic Software", "Topic :: Scientific/Engineering :: Artificial Intelligence" ], "description": "pycorpora\n=========\n\n.. image:: https://travis-ci.org/aparrish/pycorpora.svg?branch=master\n :target: https://travis-ci.org/aparrish/pycorpora\n\nA simple Python interface for Darius Kazemi's `Corpora Project\n`_, \"a collection of static corpora\n(plural of 'corpus') that are potentially useful in the creation of weird\ninternet stuff.\" The ``pycorpora`` interface makes it easy to use data from the\nCorpora Project in your program. Here's an example of how it works::\n\n import pycorpora\n import random\n\n # print a random flower name\n print random.choice(pycorpora.plants.flowers['flowers'])\n\n # print a random word coined by Shakespeare\n print random.choice(pycorpora.words.literature.shakespeare_words['words'])\n\n`Allison Parrish `_ created the ``pycorpora`` \ninterface. Python 3 is not yet supported. The source code for the package is `on GitHub\n`_. Contributions are welcome!\n\nInstallation\n------------\n\nInstallation by hand::\n\n python setup.py install\n\nInstallation with pip::\n\n pip install pycorpora\n\nThe package does not include data from the Corpora Project; instead, the data\nis downloaded when the package is installed (using either of the methods\nabove). By default, the \"master\" branch of the `Corpora Project GitHub\nrepository `_ is used as the source for the\ndata. You can specify an alternative URL to download the data from using the\nargument ``--corpora-zip-url`` on the command line with either of the two\nmethods above::\n\n python setup.py install --corpora-zip-url=http://example.com/corpora.zip\n\n... or, with ``pip``::\n\n pip install pycorpora --install-option=\"--corpora-zip-url=http://example.com/corpora.zip\"\n\n(The intention of ``--corpora-zip-url`` is to let you install Corpora Project\ndata from a particular branch, commit or fork, so that changes to the bleeding\nedge of the project don't break your code.)\n\nUpdate\n------\n\nUpdate Corpora Project data by reinstalling with pip:\n\n pip install --upgrade --force-reinstall pycorpora\n\nUsage\n-----\n\nGetting the data from a particular Corpora Project file is easy. Here's an\nexample::\n\n import pycorpora\n crayola_data = pycorpora.colors.crayola\n print crayola_data[\"colors\"][0][\"color\"] # prints \"Almond\"\n\nThe expression ``pycorpora.colors.crayola`` returns data deserialized from the\nJSON file located at ``data/colors/crayola.json`` in the Corpora Project (i.e.,\n`this file\n`_).\nYou can use this syntax even with more deeply nested subdirectories::\n\n import pycorpora\n mr_men_little_miss_data = pycorpora.words.literature.mr_men_little_miss\n print mr_men_little_miss_data[\"little_miss\"][-1] # prints \"Wise\"\n\nYou can use ``from pycorpora import ...`` to import a particular Corpora Project\ncategory::\n\n from pycorpora import governments\n print governments.nsa_projects[\"codenames\"][0] # prints \"ARTIFICE\"\n\n from pycorpora import humans\n print humans.occupations[\"occupations\"][0] # prints \"accountant\"\n\nYou can also use square bracket indexing instead of attributes for accessing\nsubcategories and individual corpora (just in case the Corpora Project ever adds\nfiles with names that aren't valid Python identifiers)::\n\n import pycorpora\n import random\n fruits = pycorpora.foods[\"fruits\"]\n print random.choice(fruits[\"fruits\"]) # prints \"pomelo\" maybe\n\nAdditionally, ``pycorpora`` supports an API similar to that provided by the `Corpora Project node package `_::\n\n import pycorpora\n\n # get a list of all categories\n pycorpora.get_categories() # [\"animals\", \"archetypes\"...]\n\n # get a list of subcategories for a particular category\n pycorpora.get_categories(\"words\") # [\"literature\", \"word_clues\"...]\n\n # get a list of all files in a particular category\n pycorpora.get_files(\"animals\") # [\"birds_antarctica\", \"birds_uk\", ...]\n\n # get data deserialized from the JSON data in a particular file\n pycorpora.get_file(\"animals\", \"birds_antarctica\") # returns dict w/data\n\n # get file in a subcategory\n pycorpora.get_file(\"words/literature\", \"shakespeare_words\")\n\nAs an extension of this interface, you can also use the ``get_categories``,\n``get_files`` and ``get_file`` methods on individual categories::\n\n import pycorpora\n\n # get a list of files in the \"archetypes\" category\n pycorpora.archetypes.get_files() # ['artifact', 'character', 'event', ...]\n\n # get an individual file from the \"archetypes\" category\n pycorpora.archetypes.get_file(\"character\") # returns dictionary w/data\n\n # get subcategories of a category\n pycorpora.words.get_categories() # ['literature', 'word_clues']\n\nExamples\n--------\n\nHere are a few quick examples of using data from the Corpora Project to do\nweird and fun stuff.\n\nCreate a list of whimsically colored flowers::\n\n from pycorpora import plants, colors\n import random\n\n random_flowers = random.sample(plants.flowers[\"flowers\"], 10)\n random_colors = random.sample(\n [item['color'] for item in colors.crayola[\"colors\"]], 10)\n for pair in zip(random_colors, random_flowers):\n print \" \".join(pair).title()\n\n # outputs (e.g.):\n # Maroon Bergamot\n # Blue Bell Zinnia\n # Pink Flamingo Camellias\n # Tickle Me Pink Begonia\n # Burnt Orange Clover\n # Fuzzy Wuzzy Hibiscus\n # Outer Space Forget Me Not\n # Almond Petunia\n # Pine Green Ladys Slipper\n # Shadow Jasmine\n\nCreate random biographies::\n\n from pycorpora import humans, geography\n import random\n \n def a_biography():\n return \"{0} is a(n) {1} who lives in {2}.\".format(\n random.choice(humans.firstNames[\"firstNames\"]),\n random.choice(humans.occupations[\"occupations\"]),\n random.choice(geography.us_cities[\"cities\"])[\"city\"])\n \n for i in range(5):\n print a_biography()\n\n # outputs (e.g.):\n # Jessica is a(n) ceiling tile installer who lives in Grand Forks.\n # Kayla is a(n) substance abuse social worker who lives in Torrance.\n # Luis is a(n) hydrologist who lives in Saginaw.\n # Leah is a(n) heating installer who lives in Danville.\n # Grant is a(n) building inspector who lives in Vineland.\n\nAutomated pizza topping-related boasts about your inebriation::\n\n from pycorpora import words, foods\n import random\n\n # \"I'm so smashed I could eat a pizza with spinach, cheese, *and* hot sauce.\"\n print \"I'm so {0} I could eat a pizza with {1}, {2}, *and* {3}.\".format(\n random.choice(words.states_of_drunkenness[\"states_of_drunkenness\"]),\n *random.sample(foods.pizzaToppings[\"pizzaToppings\"], 3))\n\nThe possibilities... are endless.\n\nHistory\n-------\n\n* 0.1.2: Python 3 compatibility (contributed by Sam Raker); vastly improved\n build process (contributed by Hugo van Kemenade).\n\nLicense\n-------\n\nThe ``pycorpora`` package is MIT licensed (see LICENSE.txt). The data in the\nCorpora Project is itself in the public domain (CC0).\n\nAcknowledgements\n----------------\n\nThanks to Darius Kazemi and all of the Corpora Project contributors!\n\nThis package was developed as part of my Spring 2015 research fellowship at\n`ITP `_. Thank you to the program and its students for\ntheir interest and support!", "description_content_type": null, "docs_url": null, "download_url": null, "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/aparrish/pycorpora", "keywords": "nlp corpus text language", "license": "LICENSE.txt", "maintainer": null, "maintainer_email": null, "name": "pycorpora", "package_url": "https://pypi.org/project/pycorpora/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/pycorpora/", "project_urls": { "Homepage": "https://github.com/aparrish/pycorpora" }, "release_url": "https://pypi.org/project/pycorpora/0.1.2/", "requires_dist": null, "requires_python": null, "summary": "A Python wrapper for Darius Kazemi's Corpora Project", "version": "0.1.2" }, "last_serial": 1603769, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "8c24c05e2901128791f9f4ecfe278bb8", "sha256": "6f79178e28523085f1f772fc01aa49bbe849d9c92a604163637edeac5a1babe9" }, "downloads": -1, "filename": "pycorpora-0.1.0.tar.gz", "has_sig": false, "md5_digest": "8c24c05e2901128791f9f4ecfe278bb8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6362, "upload_time": "2015-03-28T22:56:21", "url": "https://files.pythonhosted.org/packages/fa/5b/3caa30bc03d22d5f0bdaa6a1abf070aeb27a371ddcf7527b8a0dcae0b7e6/pycorpora-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "e0d7dfad9e6e4a48198e63bd51b9dc7e", "sha256": "1d5a20d94f9cd978e9eed997702d9aa4241be324064a4478adde237a0536d060" }, "downloads": -1, "filename": "pycorpora-0.1.1.tar.gz", "has_sig": false, "md5_digest": "e0d7dfad9e6e4a48198e63bd51b9dc7e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6383, "upload_time": "2015-03-28T23:14:34", "url": "https://files.pythonhosted.org/packages/24/4d/15902b470016f510faf32e8bf64d09ac8f8b3ca18ba43f92a6f992f15807/pycorpora-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "4d9d92dfc0c8a0e69156c2f517a4eb96", "sha256": "86f4318e5d75b2d79024c10f92e26761921baa8cad648c9331f68047abff9de2" }, "downloads": -1, "filename": "pycorpora-0.1.2.tar.gz", "has_sig": false, "md5_digest": "4d9d92dfc0c8a0e69156c2f517a4eb96", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6685, "upload_time": "2015-06-23T20:21:42", "url": "https://files.pythonhosted.org/packages/64/84/6d57f3956cbfd247e44b7b686fd7a431bb6e2e7c75e72a38c8e3672374e3/pycorpora-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4d9d92dfc0c8a0e69156c2f517a4eb96", "sha256": "86f4318e5d75b2d79024c10f92e26761921baa8cad648c9331f68047abff9de2" }, "downloads": -1, "filename": "pycorpora-0.1.2.tar.gz", "has_sig": false, "md5_digest": "4d9d92dfc0c8a0e69156c2f517a4eb96", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6685, "upload_time": "2015-06-23T20:21:42", "url": "https://files.pythonhosted.org/packages/64/84/6d57f3956cbfd247e44b7b686fd7a431bb6e2e7c75e72a38c8e3672374e3/pycorpora-0.1.2.tar.gz" } ] }