{ "info": { "author": "Joe Crobak", "author_email": "joecrow@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: System Administrators", "License :: OSI Approved :: Apache Software License", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "parquet-python\n==============\n\n.. image:: https://travis-ci.org/jcrobak/parquet-python.svg?branch=master\n :target: https://travis-ci.org/jcrobak/parquet-python\n\nparquet-python is a pure-python implementation (currently with only\nread-support) of the `parquet\nformat `_. It comes with a\nscript for reading parquet files and outputting the data to stdout as\nJSON or TSV (without the overhead of JVM startup). Performance has not\nyet been optimized, but it's useful for debugging and quick viewing of\ndata in files.\n\nNot all parts of the parquet-format have been implemented yet or tested\ne.g. nested data\u2014see Todos below for a full list. With that said,\nparquet-python is capable of reading all the data files from the\n`parquet-compatability `_\nproject.\n\nrequirements\n============\n\nparquet-python has been tested on python 2.7, 3.4, and 3.5. It depends\non ``thrift`` (0.9) and ``python-snappy`` (for snappy compressed files).\n\ngetting started\n===============\n\nparquet-python is available via PyPI and can be installed using\n`pip install parquet`. The package includes the `parquet`\ncommand for reading parquet files, e.g. 
`parquet test.parquet`.\nSee `parquet --help` for full usage.\n\nExample\n-------\n\nparquet-python currently has two programmatic interfaces with similar\nfunctionality to Python's csv reader. First, it supports a DictReader\nwhich returns a dictionary per row. Second, it has a reader which\nreturns a list of values for each row. Both functions require a file-like\nobject and support an optional ``columns`` field to only read the\nspecified columns.\n\n.. code:: python\n\n\n import parquet\n import json\n\n ## assuming parquet file with two rows and three columns:\n ## foo bar baz\n ## 1 2 3\n ## 4 5 6\n\n with open(\"test.parquet\", \"rb\") as fo:\n # prints:\n # {\"foo\": 1, \"bar\": 2}\n # {\"foo\": 4, \"bar\": 5}\n for row in parquet.DictReader(fo, columns=['foo', 'bar']):\n print(json.dumps(row))\n\n\n with open(\"test.parquet\", \"rb\") as fo:\n # prints:\n # 1,2\n # 4,5\n for row in parquet.reader(fo, columns=['foo', 'bar']):\n print(\",\".join([str(r) for r in row]))\n\nTodos\n=====\n\n- Support the deprecated bitpacking\n- Fix handling of repetition-levels and definition-levels\n- Tests for nested schemas, null data\n- Support reading of data from HDFS via snakebite and/or webhdfs.\n- Implement writing\n- performance evaluation and optimization (i.e. how does it compare to\n the c++, java implementations)\n\nContributing\n============\n\nIs done via Pull Requests. 
Please include tests with your changes and\nfollow `pep8 `_.\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/jcrobak/parquet-python", "keywords": "", "license": "Apache License 2.0", "maintainer": "", "maintainer_email": "", "name": "parquet", "package_url": "https://pypi.org/project/parquet/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/parquet/", "project_urls": { "Homepage": "https://github.com/jcrobak/parquet-python" }, "release_url": "https://pypi.org/project/parquet/1.2/", "requires_dist": [ "python-snappy", "thriftpy (>=0.3.6)", "backports.csv; python_version==\"2.7\"" ], "requires_python": "", "summary": "Python support for Parquet file format", "version": "1.2" }, "last_serial": 2899479, "releases": { "0.0.0": [], "1.0": [ { "comment_text": "", "digests": { "md5": "2324d29645b302ffc4390942c4683681", "sha256": "ccb686ad551756c0873b4b4c0f18fe1e35865d1248d86b2a8506f96e7b6c72cc" }, "downloads": -1, "filename": "parquet-1.0.tar.gz", "has_sig": false, "md5_digest": "2324d29645b302ffc4390942c4683681", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16332, "upload_time": "2016-03-14T03:06:58", "url": "https://files.pythonhosted.org/packages/cf/eb/64fe14d3b477537920e7bc80df394b147eabec04e0620f629e8265017470/parquet-1.0.tar.gz" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "06b5483e47506c742836ea85a630031f", "sha256": "bcc318decb1d6f14d779838a6a8206840cd4febdaa923b9139b6b1bd9a71f1f6" }, "downloads": -1, "filename": "parquet-1.1-py2-none-any.whl", "has_sig": true, "md5_digest": "06b5483e47506c742836ea85a630031f", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 18254, "upload_time": "2016-08-16T02:18:32", "url": 
"https://files.pythonhosted.org/packages/86/3e/6abf522cb2543104623b842e91e3f59bec58112d29d86a6f9b8d02825e43/parquet-1.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "36e9cd8dcf8fa7c12ee7b10bba3f7f9b", "sha256": "6116d9722c2eedbd34375f5a3a93a6cb31c86ec7d67225918d0caa4593447565" }, "downloads": -1, "filename": "parquet-1.1-py3-none-any.whl", "has_sig": true, "md5_digest": "36e9cd8dcf8fa7c12ee7b10bba3f7f9b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18256, "upload_time": "2016-08-16T02:18:36", "url": "https://files.pythonhosted.org/packages/8d/e3/a4976155440533ccde527f65d4783e77b7a1ad3a66c9de223cc181deebd9/parquet-1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1882bb10b2cc4ac613a9bddda784986d", "sha256": "a962f4ad7581b1f1c689989c85d69eff6380d68418538546a296168e8acfbe8f" }, "downloads": -1, "filename": "parquet-1.1.tar.gz", "has_sig": true, "md5_digest": "1882bb10b2cc4ac613a9bddda784986d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18230, "upload_time": "2016-08-16T02:18:39", "url": "https://files.pythonhosted.org/packages/6f/25/ca111c6428ad610b617469c6ec0a90783b9b1fcd50e2335a2f1e7a6521c6/parquet-1.1.tar.gz" } ], "1.2": [ { "comment_text": "", "digests": { "md5": "63f9785af4d486dcfd708846e0590a55", "sha256": "ff39f63160a1b6226eb0257c0cd6a3d6f015e10681bdd6e4e0713c9df5e8b94e" }, "downloads": -1, "filename": "parquet-1.2-py2-none-any.whl", "has_sig": true, "md5_digest": "63f9785af4d486dcfd708846e0590a55", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 20231, "upload_time": "2017-05-26T02:33:10", "url": "https://files.pythonhosted.org/packages/4b/a3/c0aae38ac1bc7137a510d326fb99482ddd6cb6d468e9875e28be011bb833/parquet-1.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c536aaea853f87cd685e0573e86c7a5f", "sha256": "67a9ac65b3748a4ae1185facd70540cfb5534416b43d0a1650422dbb4f52eb91" }, 
"downloads": -1, "filename": "parquet-1.2-py3-none-any.whl", "has_sig": true, "md5_digest": "c536aaea853f87cd685e0573e86c7a5f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20231, "upload_time": "2017-05-26T02:33:12", "url": "https://files.pythonhosted.org/packages/d4/2c/31867848b0238fb1cf0b2fcb60296b3bd7e3c455b97c92026b6be652d34c/parquet-1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05aacec0620ac63ecd7dd77bf7fb9fee", "sha256": "5b45b63f3381af8d059ecc301954fa15babb6ba96e95939382e42c94520e8045" }, "downloads": -1, "filename": "parquet-1.2.tar.gz", "has_sig": true, "md5_digest": "05aacec0620ac63ecd7dd77bf7fb9fee", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21453, "upload_time": "2017-05-26T02:33:14", "url": "https://files.pythonhosted.org/packages/74/b5/bc459aab0566fc3cf3397467922c37411ab6e3361bab9e0ca165e1089ce8/parquet-1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "63f9785af4d486dcfd708846e0590a55", "sha256": "ff39f63160a1b6226eb0257c0cd6a3d6f015e10681bdd6e4e0713c9df5e8b94e" }, "downloads": -1, "filename": "parquet-1.2-py2-none-any.whl", "has_sig": true, "md5_digest": "63f9785af4d486dcfd708846e0590a55", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 20231, "upload_time": "2017-05-26T02:33:10", "url": "https://files.pythonhosted.org/packages/4b/a3/c0aae38ac1bc7137a510d326fb99482ddd6cb6d468e9875e28be011bb833/parquet-1.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c536aaea853f87cd685e0573e86c7a5f", "sha256": "67a9ac65b3748a4ae1185facd70540cfb5534416b43d0a1650422dbb4f52eb91" }, "downloads": -1, "filename": "parquet-1.2-py3-none-any.whl", "has_sig": true, "md5_digest": "c536aaea853f87cd685e0573e86c7a5f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20231, "upload_time": "2017-05-26T02:33:12", "url": 
"https://files.pythonhosted.org/packages/d4/2c/31867848b0238fb1cf0b2fcb60296b3bd7e3c455b97c92026b6be652d34c/parquet-1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05aacec0620ac63ecd7dd77bf7fb9fee", "sha256": "5b45b63f3381af8d059ecc301954fa15babb6ba96e95939382e42c94520e8045" }, "downloads": -1, "filename": "parquet-1.2.tar.gz", "has_sig": true, "md5_digest": "05aacec0620ac63ecd7dd77bf7fb9fee", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21453, "upload_time": "2017-05-26T02:33:14", "url": "https://files.pythonhosted.org/packages/74/b5/bc459aab0566fc3cf3397467922c37411ab6e3361bab9e0ca165e1089ce8/parquet-1.2.tar.gz" } ] }