{ "info": { "author": "William Teo", "author_email": "eterna2@hotmail.com", "bugtrack_url": null, "classifiers": [], "description": "# databricks-utils\n[![Python version](https://img.shields.io/badge/python-3.6-blue.svg)](https://shields.io/)\n[![Pyspark version](https://img.shields.io/badge/pyspark-2.3.1-blue.svg)](https://shields.io/)\n[![Build Status](https://travis-ci.org/e2fyi/databricks-utils.svg?branch=master)](https://travis-ci.org/e2fyi/databricks-utils)\n\n`databricks-utils` is a Python package that provides several utility classes and functions\nthat improve ease of use in Databricks notebooks.\n\n### Installation\n```bash\npip install databricks-utils\n```\n\n### Features\n- `S3Bucket` class to easily interact with an [S3 bucket](https://aws.amazon.com/s3/) via [`dbfs`](https://docs.databricks.com/user-guide/dbfs-databricks-file-system.html) and Databricks Spark.\n\n- `vega_embed` to render charts from [Vega](https://vega.github.io/vega/) and [Vega-Lite](https://vega.github.io/vega-lite/) specifications.\n\n### Documentation\nAPI documentation can be found at [https://e2fyi.github.io/databricks-utils/](https://e2fyi.github.io/databricks-utils/).\n\n\n### Quick start\n**S3Bucket** \n```python\nimport json\nfrom databricks_utils.aws import S3Bucket\n\n# need to attach notebook's dbutils\n# before S3Bucket can be used\nS3Bucket.attach_dbutils(dbutils)\n\n# create an instance of the s3 bucket\nbucket = (S3Bucket(\"somebucketname\", \"SOMEACCESSKEY\", \"SOMESECRETKEY\")\n .allow_spark(sc) # local spark context\n .mount(\"somebucketname\")) # mount location name (resolves as `/mnt/somebucketname`)\n\n# show list of files/folders in the bucket \"resource\" folder\nbucket.ls(\"resource/\")\n\n# read in a json file from the bucket\n# (the \"r\" mode belongs to open(), not to bucket.local())\nwith open(bucket.local(\"resource/somefile.json\"), \"r\") as f:\n    data = json.load(f)\n\n# read from parquet via spark\ndataframe = spark.read.parquet(bucket.s3(\"resource/somedf.parquet\"))\n\n# unmount the bucket\nbucket.umount()\n```\n\n**Vega** 
\n[Vega](https://vega.github.io/vega/) and [Vega-Lite](https://vega.github.io/vega-lite/)\nare high-level grammars of interactive graphics. They provide concise JSON\nsyntax for rapidly generating visualizations to support analysis.\n\n```python\nfrom databricks_utils.vega import vega_embed\n\n# vega-lite spec for a bar chart\nspec = {\n \"data\": {\n \"values\": [\n {\"a\": \"A\",\"b\": 28}, {\"a\": \"B\",\"b\": 55}, {\"a\": \"C\",\"b\": 43},\n {\"a\": \"D\",\"b\": 91}, {\"a\": \"E\",\"b\": 81}, {\"a\": \"F\",\"b\": 53},\n {\"a\": \"G\",\"b\": 19}, {\"a\": \"H\",\"b\": 87}, {\"a\": \"I\",\"b\": 52}\n ]\n },\n \"mark\": \"bar\",\n \"encoding\": {\n \"x\": {\"field\": \"a\", \"type\": \"ordinal\"},\n \"y\": {\"field\": \"b\", \"type\": \"quantitative\"}\n }\n}\n\n# plot out the vega chart in databricks notebook\ndisplayHTML(vega_embed(spec=spec))\n```\n\n### Developer\n```bash\n# add a version to git tag and publish to pypi\n. add_tag.sh \n```", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/e2fyi/databricks-utils", "keywords": "", "license": "Apache License 2.0", "maintainer": "", "maintainer_email": "", "name": "databricks-utils", "package_url": "https://pypi.org/project/databricks-utils/", "platform": "", "project_url": "https://pypi.org/project/databricks-utils/", "project_urls": { "Homepage": "https://github.com/e2fyi/databricks-utils" }, "release_url": "https://pypi.org/project/databricks-utils/0.0.7/", "requires_dist": null, "requires_python": "", "summary": "Ease-of-use utility tools for databricks notebooks.", "version": "0.0.7" }, "last_serial": 4025261, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "424928727b3e06cce2228f60843da0a0", "sha256": "843ab5aaf45337ae7516df01f6e0e755d815ccd0c0fb2f2d20417baebe83c82d" }, "downloads": -1, "filename": "databricks-utils-0.0.2.tar.gz", "has_sig": false, 
"md5_digest": "424928727b3e06cce2228f60843da0a0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4022, "upload_time": "2018-07-02T03:38:56", "url": "https://files.pythonhosted.org/packages/a4/60/5aa624b384a04cb1e425ea30015e91d7fbc225679ca8043c5b08994d7f62/databricks-utils-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "bc6eba0eb6e360d69958a1bd9de36e95", "sha256": "32decd49e2b34b87d0a9cc001f6a1adacc8447bfdc3d7963d62c2fb910925f7e" }, "downloads": -1, "filename": "databricks-utils-0.0.3.tar.gz", "has_sig": false, "md5_digest": "bc6eba0eb6e360d69958a1bd9de36e95", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4365, "upload_time": "2018-07-02T04:47:30", "url": "https://files.pythonhosted.org/packages/3e/44/f75df7556837f4a3153f2e9187da2eee83b32652dcc4302b3513bd7d22d6/databricks-utils-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "47238047f3cc9afc7b1faa2707be175b", "sha256": "1d4c088770ffd313d39f2fce6597f22851bab552cd15b14a38cbc08dc1a454da" }, "downloads": -1, "filename": "databricks-utils-0.0.4.tar.gz", "has_sig": false, "md5_digest": "47238047f3cc9afc7b1faa2707be175b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4611, "upload_time": "2018-07-03T03:30:10", "url": "https://files.pythonhosted.org/packages/07/c2/e4e78832c9703a27c482bb94756b6b217a06e329b5c45b45e5943b8f5949/databricks-utils-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "13d007d452ddf1345166cfd88ed8aeff", "sha256": "42d91ce6a3c1766f4a5b1bed91ea1128827ed8c98a17d08666b01c7dce2c2176" }, "downloads": -1, "filename": "databricks-utils-0.0.5.tar.gz", "has_sig": false, "md5_digest": "13d007d452ddf1345166cfd88ed8aeff", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4639, "upload_time": "2018-07-03T05:41:51", "url": 
"https://files.pythonhosted.org/packages/70/00/745abc62d37c048d5fe9f68c7934adbd3313c2d18b5bd0747c8113719cb9/databricks-utils-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "3dee9cbb7153f6bfcb76d8cd57013453", "sha256": "f5049b60bd345601efc149f2f10ae6402f2c00c38d993b339182890cef604f69" }, "downloads": -1, "filename": "databricks-utils-0.0.6.tar.gz", "has_sig": false, "md5_digest": "3dee9cbb7153f6bfcb76d8cd57013453", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4517, "upload_time": "2018-07-03T06:02:44", "url": "https://files.pythonhosted.org/packages/be/8b/e6fde959b24da6d1f307c9eaab67af1f7ed670cad9971da60a7858fc6c24/databricks-utils-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "fe61aea95875a9ae324e75ecf832c792", "sha256": "0dfe371cbdc65f29cebdb1ff99905b28addea579500ab3cf29a278c11f66b4ca" }, "downloads": -1, "filename": "databricks-utils-0.0.7.tar.gz", "has_sig": false, "md5_digest": "fe61aea95875a9ae324e75ecf832c792", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4542, "upload_time": "2018-07-03T07:25:29", "url": "https://files.pythonhosted.org/packages/89/05/4e40e0546bd2415b3fb38eab0d7fd48bead8877cf6121b5e64dc5401c69b/databricks-utils-0.0.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "fe61aea95875a9ae324e75ecf832c792", "sha256": "0dfe371cbdc65f29cebdb1ff99905b28addea579500ab3cf29a278c11f66b4ca" }, "downloads": -1, "filename": "databricks-utils-0.0.7.tar.gz", "has_sig": false, "md5_digest": "fe61aea95875a9ae324e75ecf832c792", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4542, "upload_time": "2018-07-03T07:25:29", "url": "https://files.pythonhosted.org/packages/89/05/4e40e0546bd2415b3fb38eab0d7fd48bead8877cf6121b5e64dc5401c69b/databricks-utils-0.0.7.tar.gz" } ] }