{ "info": { "author": "Jonathon Morgan", "author_email": "jonathon@goodattheinternet.com", "bugtrack_url": null, "classifiers": [], "description": "Penny\n=====\n\nInspect your data. Find the truth.\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. figure:: http://www.martianwatches.com/wp-content/uploads/2013/10/InspectorGadget.jpg\n :alt: alt tag\n\n alt tag\nUncle Gadget was great and all, but when it came to real detective work,\nwe all know Penny did the heavy lifting. Hence, Penny, the Python module\nthat inspects stuff. Feed it rows or columns from a dataset, and get\ninformation about the column types -- including whether or not a given\ncolumn represents a category or date. Penny also finds column headers\n(waaaay more reliably than the ``Sniffer`` class in the standard\n``csv`` module).\n\nWhy?\n~~~~\n\nIf you're working with a few datasets, it's easy to figure out which\ncolumns are supposed to be dates, integers and even categories just by\nlooking at the raw csv files. But if you need to programmatically deal\nwith lots of datasets, this gets tedious fast.\n\nSetup\n~~~~~\n\nGrab the package.\n\n::\n\n pip install penny\n\nOr grab the code from GitHub.\n\n::\n\n git clone https://github.com/gati/penny\n cd penny\n pip install -r requirements.txt\n\nGetting Started\n~~~~~~~~~~~~~~~\n\nGuess the headers of a csv file.\n\n.. code:: python\n\n from penny.headers import get_headers\n\n with open('your-awesome-file.csv') as csvfile:\n has_header, headers = get_headers(csvfile)\n \n # Prints True/False depending on whether or not headers were found\n print(has_header)\n\n # Prints column headers or placeholders if real headers weren't found\n print(headers) # ['Example Header A', 'Example Header B']\n\nGuess the data type of a column in your dataset.\n\n.. 
code:: python\n\n import csv\n\n from penny.inspectors import column_types_probabilities\n\n fileobj = open('your-awesome-file.csv')\n rows = list(csv.reader(fileobj))\n\n # Get the values from column 0\n column_0 = [x[0] for x in rows]\n probs = column_types_probabilities(column_0)\n\n # Prints something like {'date': 1, 'int': .75, 'category': 0 ...}\n print(probs)\n\nOr get type guesses for all the rows in your dataset at once.\n\n.. code:: python\n\n import csv\n\n from penny.inspectors import rows_types_probabilities\n\n fileobj = open('your-awesome-file.csv')\n rows = list(csv.reader(fileobj))\n probs = rows_types_probabilities(rows)\n\nPenny checks for a lot of data \"types,\" not just the standard ``int``,\n``str``, etc. Here's the list (for now):\n\n- **date** something ``dateutil.parser`` can parse into a ``datetime``\n object\n- **int** a whole number\n- **bool** y/n or yes/no or something true/falsey\n- **float** a number with a decimal\n- **category** something you might want to group records by\n- **text** string longer than 90 characters (something you could get\n names/places/sentiment/etc from)\n- **id** unique for each row\n- **coord** a float that might be a latitude or longitude\n- **coord\\_pair** string that looks like \"coord,coord\"\n- **proportion** column where all values sum to 1 or 100\n- **street** house number + street name\n- **city** one of the world's 80,000 largest cities\n- **region** smaller than a country, bigger than a city. state,\n province, etc\n- **country** a country name on the `ISO 3166\n list `__\n- **phone** a phone number\n- **email** an email address\n- **url** web address with or without http:// (so http://google.com or\n google.com)\n- **address** a full address you could geocode with a service like\n Google Maps\n\nLast but not least, you can also inspect a column for a single type.\n\n.. 
code:: python\n\n import csv\n\n from penny.list_check import column_probability_for_type\n\n fileobj = open('your-awesome-file.csv')\n rows = list(csv.reader(fileobj))\n\n # Get the values from column 0\n column_0 = [x[0] for x in rows]\n prob = column_probability_for_type(column_0, 'date')\n\n # Prints something like 0.78\n print(prob)\n\nContributing & Credits\n~~~~~~~~~~~~~~~~~~~~~~\n\nThis is a work in progress, so pull request at will. Some of this work\nwas inspired by `messytables `__,\nwhich looks great for xls files but wasn't quite what I needed. Thanks\nto `Chris Albon `__ for putting together\na `repo of useful test\ndatasets `__.\n\nQuestions, concerns, devoted fan mail to\n`@jonathonmorgan <http://twitter.com/jonathonmorgan>`__ on Twitter.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/gati/penny/tarball/0.4.1", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/gati/penny", "keywords": null, "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "penny", "package_url": "https://pypi.org/project/penny/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/penny/", "project_urls": { "Download": "https://github.com/gati/penny/tarball/0.4.1", "Homepage": "https://github.com/gati/penny" }, "release_url": "https://pypi.org/project/penny/0.4.1/", "requires_dist": null, "requires_python": null, "summary": "Inspect your data. 
Find the truth!", "version": "0.4.1" }, "last_serial": 1393971, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "47bb61ad97f6b7cae43e9a287559ca2a", "sha256": "d00e592673e4a0f15201a04408a6db5e3f6c5c52845e319536ea79d86ceeffd8" }, "downloads": -1, "filename": "penny-0.1.0.tar.gz", "has_sig": false, "md5_digest": "47bb61ad97f6b7cae43e9a287559ca2a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6552, "upload_time": "2014-09-21T20:43:49", "url": "https://files.pythonhosted.org/packages/51/b5/65a457aad130c630d23700ed2953b28e1337dede69374aeab28a6e888f96/penny-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "754450968175fd92ac4c492513c4fb00", "sha256": "08692b19658c6ad70bc1970b456b45703101926391b878d85766318744c770a6" }, "downloads": -1, "filename": "penny-0.1.1.tar.gz", "has_sig": false, "md5_digest": "754450968175fd92ac4c492513c4fb00", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6779, "upload_time": "2014-09-21T20:45:33", "url": "https://files.pythonhosted.org/packages/a2/99/ccac4023af30fbb9ecb2dbca82b83bb48e053cd9e0b4a04cdb13871ced1f/penny-0.1.1.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "a399df1225c6caa11286fe0fbb520fd9", "sha256": "1aed07569834a3d72d179bc7a9fb94a07832bd614cd5e7485644a32f8db7eb91" }, "downloads": -1, "filename": "penny-0.2.0.tar.gz", "has_sig": false, "md5_digest": "a399df1225c6caa11286fe0fbb520fd9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7506, "upload_time": "2014-09-23T03:31:51", "url": "https://files.pythonhosted.org/packages/10/ae/b2ab33fe3971805724f8f79405574e2c3f5704d8e0cddfd3097f78a9a3ff/penny-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "c48174df5c50938315a878f057fc9d2a", "sha256": "667d963f56fe6b19263a594edd2c63e98adf080c41b1407f3a066728d4a181f2" }, "downloads": -1, "filename": "penny-0.3.0.tar.gz", "has_sig": false, 
"md5_digest": "c48174df5c50938315a878f057fc9d2a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1285111, "upload_time": "2014-09-24T16:47:54", "url": "https://files.pythonhosted.org/packages/53/21/de4616a46e9762a58d1d619dc008c1eda994532b158219a92bdd265dedf6/penny-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "55157935b4587be6ccfd46372179146c", "sha256": "796f851f1c39c62671a85531ece6a46da95175d1be2c35412d1352f48e83d4bd" }, "downloads": -1, "filename": "penny-0.3.1.tar.gz", "has_sig": false, "md5_digest": "55157935b4587be6ccfd46372179146c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1285594, "upload_time": "2014-09-24T17:07:35", "url": "https://files.pythonhosted.org/packages/7e/96/0dc56249c38a94ff3cd9b70d4ad74c53d13604216121263a6dcdd4b6ad79/penny-0.3.1.tar.gz" } ], "0.3.10": [ { "comment_text": "", "digests": { "md5": "95a2cf8a008a16c49ab3430c245b4b40", "sha256": "3948908470c2ce4fac1d1a11c849bf7b1bf700d938f43484e2ef9e7c297f38e5" }, "downloads": -1, "filename": "penny-0.3.10.tar.gz", "has_sig": false, "md5_digest": "95a2cf8a008a16c49ab3430c245b4b40", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1287859, "upload_time": "2014-10-21T18:20:13", "url": "https://files.pythonhosted.org/packages/b7/66/67b2e10fe629a5304127a441deb082229c069058abcfa9aac2b0f3fce044/penny-0.3.10.tar.gz" } ], "0.3.11": [ { "comment_text": "", "digests": { "md5": "2291b356a77f5b00c0c85619540a00eb", "sha256": "49df9885e056a3b160450812397583a5f1e0a190b36bcb45f25c099beafb9ae8" }, "downloads": -1, "filename": "penny-0.3.11.tar.gz", "has_sig": false, "md5_digest": "2291b356a77f5b00c0c85619540a00eb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1287702, "upload_time": "2014-12-04T19:52:24", "url": 
"https://files.pythonhosted.org/packages/da/5f/79b62887c1682af998b127916cc9c97eacee8cf2ed9da7025c17d3a5c600/penny-0.3.11.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "e256ca26a5634cf5b24c4b5eda693bd1", "sha256": "adfa23ad6b2a8498dc07c5a6eb3845da27bd89a9b1262a253795c381d1285b14" }, "downloads": -1, "filename": "penny-0.3.2.tar.gz", "has_sig": false, "md5_digest": "e256ca26a5634cf5b24c4b5eda693bd1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1286294, "upload_time": "2014-09-30T18:16:32", "url": "https://files.pythonhosted.org/packages/ba/d6/610f1648e0511aa6c218e61007c10b814cfc1f52966b37b1f47b2253aa54/penny-0.3.2.tar.gz" } ], "0.3.3": [ { "comment_text": "", "digests": { "md5": "9e1705c3db6cf72f8e9b275daea7345e", "sha256": "49b281ebda9e213583c76b25fdc034e38cc6c4f6e08c019d311835c2dcecb78d" }, "downloads": -1, "filename": "penny-0.3.3.tar.gz", "has_sig": false, "md5_digest": "9e1705c3db6cf72f8e9b275daea7345e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1286507, "upload_time": "2014-10-02T15:58:46", "url": "https://files.pythonhosted.org/packages/e6/59/3682e3b720a7d3c280b13df64a66e0ba3c5edccf3708f9e128f60782cad3/penny-0.3.3.tar.gz" } ], "0.3.4": [ { "comment_text": "", "digests": { "md5": "0e80dd80986d71f7e147ff5c446fa1c0", "sha256": "ec5c7f1a6e19b80f25dc8c1b20308e0575a37028c0d3ce5c072390f423e69963" }, "downloads": -1, "filename": "penny-0.3.4.tar.gz", "has_sig": false, "md5_digest": "0e80dd80986d71f7e147ff5c446fa1c0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1286510, "upload_time": "2014-10-02T16:17:53", "url": "https://files.pythonhosted.org/packages/af/b1/8efcb5746dc2bf181376f4dd2577927efc4a0b4e4e1820b814f30f5311f5/penny-0.3.4.tar.gz" } ], "0.3.5": [ { "comment_text": "", "digests": { "md5": "4380436cb8fb63e07a0d545a42a1e6f6", "sha256": "c3c7e8509d230b0bffa84eeb17f5f0bac9c4c56b892fbddf961c86e545d98ba7" }, "downloads": -1, 
"filename": "penny-0.3.5.tar.gz", "has_sig": false, "md5_digest": "4380436cb8fb63e07a0d545a42a1e6f6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1286848, "upload_time": "2014-10-02T16:56:24", "url": "https://files.pythonhosted.org/packages/47/9e/21da14783427fe16c6f6dfa5bdc228dbd1367e01f0af8845fb9b5860406c/penny-0.3.5.tar.gz" } ], "0.3.6": [ { "comment_text": "", "digests": { "md5": "80dbc7b5f84faf20f9d36824c13aebf6", "sha256": "cf24819d885e712e70ce6c7d956df2ebabb12ce4e7fea0a31cbca20c70f904f1" }, "downloads": -1, "filename": "penny-0.3.6.tar.gz", "has_sig": false, "md5_digest": "80dbc7b5f84faf20f9d36824c13aebf6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1287500, "upload_time": "2014-10-03T17:18:39", "url": "https://files.pythonhosted.org/packages/56/56/d4ec6ee80dafe0d4f170b47990b65e1b20a5e063e1608dc71457d380a852/penny-0.3.6.tar.gz" } ], "0.3.9": [ { "comment_text": "", "digests": { "md5": "a95d0b16836b569c4ac7d0160a272d52", "sha256": "0d10130fb51525c6526aeb8363d1a2cdb45befe88e1b767c002a5d2ea16e6472" }, "downloads": -1, "filename": "penny-0.3.9.tar.gz", "has_sig": false, "md5_digest": "a95d0b16836b569c4ac7d0160a272d52", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1287700, "upload_time": "2014-10-20T22:56:08", "url": "https://files.pythonhosted.org/packages/eb/40/c8a2eafc8965c92e31c32d634ff4ba89766198c93b030530911a5c93e4d4/penny-0.3.9.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "8455685ccfbcfc2d865e442f09651828", "sha256": "5b6f639293a0c5e27d4005d02e803e3df928e91b622ebe103972a9f29bdd27eb" }, "downloads": -1, "filename": "penny-0.4.1.tar.gz", "has_sig": false, "md5_digest": "8455685ccfbcfc2d865e442f09651828", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1288128, "upload_time": "2015-01-23T21:23:51", "url": 
"https://files.pythonhosted.org/packages/69/b6/20c4bd007ab6f1cab009cd0108aadf411634de6ba8e405e97da296c4d8f4/penny-0.4.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8455685ccfbcfc2d865e442f09651828", "sha256": "5b6f639293a0c5e27d4005d02e803e3df928e91b622ebe103972a9f29bdd27eb" }, "downloads": -1, "filename": "penny-0.4.1.tar.gz", "has_sig": false, "md5_digest": "8455685ccfbcfc2d865e442f09651828", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1288128, "upload_time": "2015-01-23T21:23:51", "url": "https://files.pythonhosted.org/packages/69/b6/20c4bd007ab6f1cab009cd0108aadf411634de6ba8e405e97da296c4d8f4/penny-0.4.1.tar.gz" } ] }