{ "info": { "author": "Globo.com", "author_email": "backstage7@corp.globo.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Programming Language :: Python :: 2.7" ], "description": "MicroDrill\n##########\nSimple `Apache Drill\n`_\nalternative using `PySpark\n`_\ninspired by `PyDAL\n`_\n\n\nSetup\n=====\nRun terminal command ``pip install microdrill``\n\n\nDependencies\n============\nPySpark was tested with `Spark 1.6\n`_\n\n\nUsage\n=====\n\nDefining Query Parquet Table\n____________________________\n``ParquetTable(table_name, schema_index_file=file_name)``\n\n* table_name: Table referenced name.\n* file_name: File name to search for table schema.\n\nUsing Parquet DAL\n_________________\n``ParquetDAL(file_uri, sc)``\n\n* file_uri: It can be the path to files or ``hdfs://`` or any other location\n* sc: Spark Context (https://spark.apache.org/docs/1.6.0/api/python/pyspark.html#pyspark.SparkContext)\n\nConnecting in tables\n_____________________\n| ``parquet_conn = ParquetDAL(file_uri, sc)``\n| ``parquet_table = ParquetTable(table_name, schema_index_file=file_name)``\n| ``parquet_conn.set_table(parquet_table)``\n\nQueries\n_______\nReturning Table Object\n**********************\n``parquet_conn(table_name)``\n\nReturning Field Object\n**********************\n``parquet_conn(table_name)(field_name)``\n\nBasic Query\n***********\n| ``parquet_conn.select(field_object, [field_object2, ...]).where(field_object=value)``\n| ``parquet_conn.select(field_object1, field_object2).where(field_object1==value1 & ~field_object2==value2)``\n| ``parquet_conn.select(field_object1, field_object2).where(field_object1!=value1 | field_object1.regexp(reg_exp))``\n\nGrouping By\n***********\n``parquet_conn.groupby(field_object1, [field_object2, ...])``\n\nOrdering By\n***********\n| ``parquet_conn.orderby(field_object1, [field_object2, ...])``\n| ``parquet_conn.orderby(~field_object)``\n\nLimiting\n********\n``parquet_conn.limit(number)``\n\nExecuting\n*********\n``df = parquet_conn.execute()``\nexecute() returns a `PySpark DataFrame.\n`_\n\nReturning Field Names From Schema\n*********************************\n``parquet_conn(table_name).schema()``\n\n\nDevelopers\n==========\nInstall latest jdk and run in terminal ``make setup``", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/globocom/MicroDrill", "keywords": "apache drill client parquet hbase pyspark", "license": "Apache", "maintainer": null, "maintainer_email": null, "name": "microdrill", "package_url": "https://pypi.org/project/microdrill/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/microdrill/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/globocom/MicroDrill" }, "release_url": "https://pypi.org/project/microdrill/0.0.3/", "requires_dist": null, "requires_python": null, "summary": "Simple Apache Drill alternative using PySpark", "version": "0.0.3" }, "last_serial": 1984031, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "bc3e5ecc964d941009a06cc4442babf3", "sha256": "7a1e65a436258fc49173da6100b0a6cb6e69ee1c4a03d64f1fc622265333b234" }, "downloads": -1, "filename": "microdrill-0.0.1.tar.gz", "has_sig": false, "md5_digest": "bc3e5ecc964d941009a06cc4442babf3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7008, "upload_time": "2016-02-24T18:52:39", "url": "https://files.pythonhosted.org/packages/54/a3/f778e45873612f1828e1446ece5942ea08532099ec24873e359dd9c635d3/microdrill-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "0221ae2911abfb51b0a511fff0f96116", "sha256": "4f97d2d54eb4601be051350e2ab73758e84077941484adb101cac5ccb0fa622a" }, "downloads": -1, "filename": "microdrill-0.0.2.tar.gz", "has_sig": false, "md5_digest": "0221ae2911abfb51b0a511fff0f96116", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7943, "upload_time": "2016-02-25T14:11:51", "url": "https://files.pythonhosted.org/packages/d8/a4/af7288704d5c8ca237b7bae202b01db0a192979ebdb9b427b2488469e289/microdrill-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "2319b58847b3cf34879b97ab4d88c7dd", "sha256": "0ffa274b046a602225909f4cdaa7943be64f7dc99217583d37456468d303ca73" }, "downloads": -1, "filename": "microdrill-0.0.3.tar.gz", "has_sig": false, "md5_digest": "2319b58847b3cf34879b97ab4d88c7dd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7885, "upload_time": "2016-03-01T14:29:38", "url": "https://files.pythonhosted.org/packages/4e/94/12aab5729bafac62bc3f036eaf60ae2a6e7be7dcbdd243817ee463e1d27a/microdrill-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2319b58847b3cf34879b97ab4d88c7dd", "sha256": "0ffa274b046a602225909f4cdaa7943be64f7dc99217583d37456468d303ca73" }, "downloads": -1, "filename": "microdrill-0.0.3.tar.gz", "has_sig": false, "md5_digest": "2319b58847b3cf34879b97ab4d88c7dd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7885, "upload_time": "2016-03-01T14:29:38", "url": "https://files.pythonhosted.org/packages/4e/94/12aab5729bafac62bc3f036eaf60ae2a6e7be7dcbdd243817ee463e1d27a/microdrill-0.0.3.tar.gz" } ] }