{ "info": { "author": "Michael Penkov", "author_email": "misha.penkov@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5" ], "description": "==========\ncsvinsight\n==========\n\n\n.. image:: https://img.shields.io/pypi/v/csvinsight.svg\n :target: https://pypi.python.org/pypi/csvinsight\n\n.. image:: https://circleci.com/gh/ProfoundNetworks/csvinsight.svg?style=shield&circle-token=:circle-token\n :target: https://circleci.com/gh/ProfoundNetworks/csvinsight\n :alt: Build Status\n\n.. image:: https://readthedocs.org/projects/csvinsight/badge/?version=latest\n :target: https://csvinsight.readthedocs.io/en/latest/?badge=latest\n :alt: Documentation Status\n\n.. image:: https://pyup.io/repos/github/ProfoundNetworks/csvinsight/shield.svg\n :target: https://pyup.io/repos/github/ProfoundNetworks/csvinsight/\n :alt: Updates\n\n\nFast & simple summary for large CSV files\n\n\n* Free software: MIT license\n* Documentation: https://csvinsight.readthedocs.io.\n\n\nFeatures\n--------\n\n* Calculates basic stats for each column: max, min, mean length; number of non-empty values\n* Calculates exact number of unique values and the top 20 most frequent values\n* Supports non-orthogonal data (list fields)\n* Works with very large files: does not load the entire CSV into memory\n* Fast splitting of CSVs into columns, one file per column\n* Multiprocessing-enabled\n\nExample Usage\n-------------\n\nGiven a CSV file::\n\n bash-3.2$ cat tests/sampledata.csv\n name|age|fave_color\n Alexey|33|red;yellow\n Boris|31|blue\n Valentina|0|\n\nyou can obtain a CsvInsight report with::\n\n bash-3.2$ csvi tests/sampledata.csv --list-fields fave_color\n CSV Insight Report\n Total # Rows: 3\n Column counts:\n 3 columns -> 3 rows\n\n Report Format:\n Column Number. Column Header -> Uniques: # ; Fills: # ; Fill Rate:\n Field Length: min #, max #, average:\n Top n field values -> Dupe Counts\n\n\n 1. name -> Uniques: 3 ; Fills: 3 ; Fill Rate: 100.0%\n Field Length: min 5, max 9, avg 6.67\n Counts Percent Field Value\n 1 33.33 % Valentina\n 1 33.33 % Boris\n 1 33.33 % Alexey\n\n 2. age -> Uniques: 3 ; Fills: 3 ; Fill Rate: 100.0%\n Field Length: min 1, max 2, avg 1.67\n Counts Percent Field Value\n 1 33.33 % 33\n 1 33.33 % 31\n 1 33.33 % 0\n\n 3. fave_color -> Uniques: 4 ; Fills: 3 ; Fill Rate: 75.0%\n Field Length: min 0, max 6, avg 3.25\n Counts Percent Field Value\n 1 25.00 % yellow\n 1 25.00 % red\n 1 25.00 % blue\n 1 25.00 % NULL\n\nSince CSV comes in different flavors, you may need to tweak the underlying CSV parser's parameters to read your file successfully.\nCSVInsight handles this via CSV dialects.\nFor example, to read a comma-separated file, you would use the following command::\n\n bash-3.2$ csvi your/file.csv --dialect delimiter=,\n\nYou may combine as many dialect parameters as needed::\n\n bash-3.2$ csvi your/file.csv --dialect delimiter=, quoting=QUOTE_NONE\n\nFor a full list of dialect parameters, see the documentation for Python's `csv module `_.\nConstant values like QUOTE_NONE are resolved automagically.\n\nOnce you've discovered the winning parameter combination for your file, save it to a YAML file::\n\n list_fields:\n - fave_color\n - another_field_name\n list_separator: ;\n dialect:\n - \"delimiter=|\"\n - \"quoting=QUOTE_NONE\"\n\nYou can then invoke CSVI as follows::\n\n bash-3.2$ csvi your/file.csv --config your/config.yaml\n\nCredits\n---------\n\nThis package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.\n\n.. _Cookiecutter: https://github.com/audreyr/cookiecutter\n.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage\n\n\n=======\nHistory\n=======\n\nUnreleased\n----------\n\n0.3.2 (2019-07-01)\n------------------\n\n* Set the field size limit to sys.maxsize\n\n0.3.1 (2019-06-26)\n------------------\n\n* Make Jupyter notebook an optional dependency\n\n0.3.0 (2018-07-11)\n------------------\n\n* Added --most-common parameter (resolved Issue #14)\n* Added --no-tiny parameter\n* Refactored temporary file naming\n* Improve error message when handling empty CSV files\n* Fixed \"Argument list too long\" error (Issue #15)\n* Added --json parameter\n* Added --ipynb parameter to generate IPython notebook\n\n0.2.3 (2017-12-09)\n------------------\n\n* Fix bug: Unicode column names now work under Py2\n\n0.2.2 (2017-12-04)\n------------------\n\n* Fix bug: Unicode characters no longer break CsvInsight on Py2\n\n0.2.1 (2017-11-27)\n------------------\n\n* Fix bug: opening gzipped files with Py3 now works\n\n0.2.0 (2017-11-25)\n------------------\n\n* Split files using gsplit and process them in parallel for faster processing\n* No longer work with streams; works exclusively with files\n* Get rid of csvi_summarize and csvi_split entry points\n* Integrated plumbum for cleaner pipelines\n* Fixed issue #11: added support for more CSV parameters via the --dialect option\n* Fixed issue #10: reading from empty files no longer raises StopIteration\n* Fixed issue #8: use the correct link to the GitHub project in the documentation\n* Fixed issue #2: implemented in-memory mode for smaller files\n\n0.1.0 (2017-10-29)\n------------------\n\n* First release on PyPI.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ProfoundNetworks/csvinsight", "keywords": "csvinsight", "license": "MIT license", "maintainer": "", "maintainer_email": "", "name": "csvinsight", "package_url": "https://pypi.org/project/csvinsight/", "platform": "", "project_url": "https://pypi.org/project/csvinsight/", "project_urls": { "Homepage": "https://github.com/ProfoundNetworks/csvinsight" }, "release_url": "https://pypi.org/project/csvinsight/0.3.2/", "requires_dist": null, "requires_python": "", "summary": "Fast & simple summary for large CSV files", "version": "0.3.2" }, "last_serial": 5468252, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "f0afa144e1909a1c9a2e9b501f7dd0c3", "sha256": "0d1b19c156be72fa767689c7c4534d9d03f3dc4b130dca0a0b909b06f3551e85" }, "downloads": -1, "filename": "csvinsight-0.1.0.tar.gz", "has_sig": false, "md5_digest": "f0afa144e1909a1c9a2e9b501f7dd0c3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24913, "upload_time": "2017-10-29T14:00:23", "url": "https://files.pythonhosted.org/packages/22/86/2d6912f60c4b4be14ffe2c5ded332d508370d0bbaba6acb279b462d02a27/csvinsight-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "394e3ad7133a7df1549f5f61836fd150", "sha256": "a0f1322cbdfd4a14a85a6132741d26cf1bc69c066f8522693b23420c0994f240" }, "downloads": -1, "filename": "csvinsight-0.2.0.tar.gz", "has_sig": false, "md5_digest": "394e3ad7133a7df1549f5f61836fd150", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23901, "upload_time": "2017-11-25T10:25:53", "url": "https://files.pythonhosted.org/packages/4e/a7/20ccafaa34489c576dc0646c31e1e6e85252061c9e962fd0a1bd26d49bd8/csvinsight-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "d71b2321612a48813935835dc70d55dd", "sha256": "f4e9bd15fc0be96fa61a019780d6a9e234aba12b3709f73c863c400f5486c081" }, "downloads": -1, "filename": "csvinsight-0.2.1.tar.gz", "has_sig": false, "md5_digest": "d71b2321612a48813935835dc70d55dd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24127, "upload_time": "2017-11-27T09:37:34", "url": "https://files.pythonhosted.org/packages/df/84/90328262c445eb70ad905f397cb23cdbf59bb7a690360e14ca0ce3b5361a/csvinsight-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "0d65e40e6b8c6a3164873c7a724d965d", "sha256": "c837f42774c0404a3ef78e84e958e439d95575ac7b8b809de14760c43241eca7" }, "downloads": -1, "filename": "csvinsight-0.2.2.tar.gz", "has_sig": false, "md5_digest": "0d65e40e6b8c6a3164873c7a724d965d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24204, "upload_time": "2017-12-04T14:03:12", "url": "https://files.pythonhosted.org/packages/08/bb/2c0da8d4ce715cd4eadb98cc3dbb2c8d65ca1b9053e48bf10e6cffa70d24/csvinsight-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "c19320a5237c60a340b1da6d4441e8d3", "sha256": "90beadecac6d12f0f2b3d863f2974a237562182e7a3a7eb3b98a5e7a62d484bc" }, "downloads": -1, "filename": "csvinsight-0.2.3.tar.gz", "has_sig": false, "md5_digest": "c19320a5237c60a340b1da6d4441e8d3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24072, "upload_time": "2017-12-09T14:03:58", "url": "https://files.pythonhosted.org/packages/6c/2e/8826b1edcab314f07f1d37a57cc65cc672dd7178ac4ef096e028d5391ac1/csvinsight-0.2.3.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "2e2acb7bab2c5e759824f4acaeafbaee", "sha256": "89f698886126ca9698f30d23702d048e792caa9744e4ff42e87dcc7366fe982c" }, "downloads": -1, "filename": "csvinsight-0.3.0.tar.gz", "has_sig": false, "md5_digest": "2e2acb7bab2c5e759824f4acaeafbaee", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27644, "upload_time": "2018-07-11T05:22:33", "url": "https://files.pythonhosted.org/packages/46/8a/38582b9e04ba980bb82c2b63dba4789686c6a9a6e900d8512a6c9089c1e1/csvinsight-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "700d422533f9bb4dd16a5adda39fecc7", "sha256": "687a48a67d1b1f15f6f6be60bc4218930051fb70d4729e32cf676d4bbfcfcbe6" }, "downloads": -1, "filename": "csvinsight-0.3.1.tar.gz", "has_sig": false, "md5_digest": "700d422533f9bb4dd16a5adda39fecc7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 34619, "upload_time": "2019-06-26T03:37:54", "url": "https://files.pythonhosted.org/packages/8b/30/a3a689e7b8e7544c5c2c5ec0d8c95beaded911dbb330bafd8bda1d92d632/csvinsight-0.3.1.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "c5d0603f6f647473cd8e27c04db3486c", "sha256": "24f7ba4158fd050ecd21afe97a0de35b2f0a50886ebf68f10025adbeb4b78c3e" }, "downloads": -1, "filename": "csvinsight-0.3.2.tar.gz", "has_sig": false, "md5_digest": "c5d0603f6f647473cd8e27c04db3486c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 34735, "upload_time": "2019-06-30T16:04:07", "url": "https://files.pythonhosted.org/packages/d5/29/914d3d6f362da6fcf5cf7390ccf232a606f7df50c37a67f4c44cb9654c70/csvinsight-0.3.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c5d0603f6f647473cd8e27c04db3486c", "sha256": "24f7ba4158fd050ecd21afe97a0de35b2f0a50886ebf68f10025adbeb4b78c3e" }, "downloads": -1, "filename": "csvinsight-0.3.2.tar.gz", "has_sig": false, "md5_digest": "c5d0603f6f647473cd8e27c04db3486c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 34735, "upload_time": "2019-06-30T16:04:07", "url": "https://files.pythonhosted.org/packages/d5/29/914d3d6f362da6fcf5cf7390ccf232a606f7df50c37a67f4c44cb9654c70/csvinsight-0.3.2.tar.gz" } ] }