{ "info": { "author": "Wes McKinney", "author_email": "wes.mckinney@twosigma.com", "bugtrack_url": null, "classifiers": [ "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5" ], "description": "# impyla\n\nPython client for HiveServer2 implementations (e.g., Impala, Hive) for\ndistributed query engines.\n\nFor higher-level Impala functionality, including a Pandas-like interface over\ndistributed data sets, see the [Ibis project][ibis].\n\n### Features\n\n* HiveServer2 compliant; works with Impala and Hive, including nested data\n\n* Fully [DB API 2.0 (PEP 249)][pep249]-compliant Python client (similar to\nsqlite or MySQL clients) supporting Python 2.6+ and Python 3.3+.\n\n* Works with Kerberos, LDAP, SSL\n\n* [SQLAlchemy][sqlalchemy] connector\n\n* Converter to [pandas][pandas] `DataFrame`, allowing easy integration into the\nPython data stack (including [scikit-learn][sklearn] and\n[matplotlib][matplotlib]); but see the [Ibis project][ibis] for a richer\nexperience\n\n### Dependencies\n\nRequired:\n\n* Python 2.6+ or 3.3+\n\n* `six`, `bit_array`\n\n* `thrift` (on Python 2.x) or `thriftpy` (on Python 3.x)\n\nFor Hive and/or Kerberos support:\n\n```\npip install thrift_sasl\npip install sasl\n```\n\nOptional:\n\n* `pandas` for conversion to `DataFrame` objects; but see the [Ibis project][ibis] instead\n\n* `sqlalchemy` for the SQLAlchemy engine\n\n* `pytest` for running tests; `unittest2` for testing on Python 2.6\n\n\n### Installation\n\nInstall the latest release (`0.13.1`) with `pip`:\n\n```bash\npip install impyla\n```\n\nFor the latest (dev) version, install directly from the repo:\n\n```bash\npip install git+https://github.com/cloudera/impyla.git\n```\n\nor clone the repo:\n\n```bash\ngit clone https://github.com/cloudera/impyla.git\ncd impyla\npython setup.py install\n```\n\n#### Running the tests\n\nimpyla uses the [pytest][pytest] toolchain, and depends on the following\nenvironment variables:\n\n```bash\nexport IMPYLA_TEST_HOST=your.impalad.com\nexport IMPYLA_TEST_PORT=21050\nexport IMPYLA_TEST_AUTH_MECH=NOSASL\n```\n\nTo run the maximal set of tests, run\n\n```bash\ncd path/to/impyla\npy.test --connect impyla\n```\n\nLeave out the `--connect` option to skip tests for DB API compliance.\n\n\n### Usage\n\nImpyla implements the [Python DB API v2.0 (PEP 249)][pep249] database interface\n(refer to it for API details):\n\n```python\nfrom impala.dbapi import connect\nconn = connect(host='my.host.com', port=21050)\ncursor = conn.cursor()\ncursor.execute('SELECT * FROM mytable LIMIT 100')\nprint cursor.description # prints the result set's schema\nresults = cursor.fetchall()\n```\n\nThe `Cursor` object also exposes the iterator interface, which is buffered\n(controlled by `cursor.arraysize`):\n\n```python\ncursor.execute('SELECT * FROM mytable LIMIT 100')\nfor row in cursor:\n process(row)\n```\n\nYou can also get back a pandas DataFrame object\n\n```python\nfrom impala.util import as_pandas\ndf = as_pandas(cur)\n# carry df through scikit-learn, for example\n```\n\n\n[pep249]: http://legacy.python.org/dev/peps/pep-0249/\n[pandas]: http://pandas.pydata.org/\n[sklearn]: http://scikit-learn.org/\n[matplotlib]: http://matplotlib.org/\n[madlib]: http://madlib.net/\n[madlibport]: https://github.com/bitfort/madlibport\n[numba]: http://numba.pydata.org/\n[llvm]: http://llvm.org/\n[pytest]: http://pytest.org/latest/\n[sqlalchemy]: http://www.sqlalchemy.org/\n[ibis]: http://www.ibis-project.org/\n[python-sasl-cython]: https://github.com/laserson/python-sasl/tree/cython/sasl\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/yimian/impyla", "keywords": "cloudera impala python hadoop sql hdfs mpp spark pydata pandas distributed db api pep 249 hive hiveserver2 hs2", "license": "Apache License, Version 2.0", "maintainer": "", "maintainer_email": "", "name": "ym-impyla", "package_url": "https://pypi.org/project/ym-impyla/", "platform": "", "project_url": "https://pypi.org/project/ym-impyla/", "project_urls": { "Homepage": "https://github.com/yimian/impyla" }, "release_url": "https://pypi.org/project/ym-impyla/0.14.0/", "requires_dist": null, "requires_python": "", "summary": "Python client for the Impala distributed query engine", "version": "0.14.0" }, "last_serial": 2497208, "releases": { "0.14.0": [ { "comment_text": "", "digests": { "md5": "483faa690138873b580c181114856025", "sha256": "1eb6f252e54ff554377437c9597224e8032282437ada69b934afde051ab9ac78" }, "downloads": -1, "filename": "ym-impyla-0.14.0.tar.gz", "has_sig": false, "md5_digest": "483faa690138873b580c181114856025", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 140861, "upload_time": "2016-12-03T11:30:07", "url": "https://files.pythonhosted.org/packages/b3/47/4d0cb3e1132f846c2fa17acc9c9b9d7fa6047eccfef895bf76c389a98a8f/ym-impyla-0.14.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "483faa690138873b580c181114856025", "sha256": "1eb6f252e54ff554377437c9597224e8032282437ada69b934afde051ab9ac78" }, "downloads": -1, "filename": "ym-impyla-0.14.0.tar.gz", "has_sig": false, "md5_digest": "483faa690138873b580c181114856025", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 140861, "upload_time": "2016-12-03T11:30:07", "url": "https://files.pythonhosted.org/packages/b3/47/4d0cb3e1132f846c2fa17acc9c9b9d7fa6047eccfef895bf76c389a98a8f/ym-impyla-0.14.0.tar.gz" } ] }