{ "info": { "author": "Drew J. Sonne", "author_email": "drew.sonne@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# spark-monitoring\n\nA Python library to interact with the Spark History Server.\n\n## Quickstart\n\n### Basic\n\n```shell\n$ pip install spark-monitoring\n```\n\n```python\nimport sparkmonitoring as sparkmon\n\nmonitoring = sparkmon.client('my.history.server')\nprint(monitoring.list_applications())\n```\n\n### Pandas\n\n```shell\n$ pip install spark-monitoring[pandas]\n```\n\n```python\nimport sparkmonitoring as sparkmon\nimport matplotlib.pyplot as plt\n\nmonitoring = sparkmon.df('my.history.server')\n\napps = monitoring.list_applications()\napps['function'] = apps.name.str.split('(').str.get(0)\nprint(apps.head().stack())\n\nplt.figure()\napps['duration'].hist(by=apps['function'], figsize=(40, 20))\nplt.show()\n\njobs = monitoring.list_jobs(apps.iloc[0].id)\n\nprint(jobs.head().stack())\n```\n\n## Reference\n\n### sparkmonitoring.client\n\nReturns a client for making calls to the Spark History Server.\n\n#### Arguments\n\n| Name | Type | Description | Default |\n|------|------|-------------|---------|\n| `server` | `string` | Hostname or IP pointing to the spark history server | |\n| `port` | `int` | Port which the spark history server is exposed on | `18080` |\n| `is_https` | `bool` | Whether or not to use https to communicate with the spark server | `False` |\n| `api_version` | `int` | API Version to interact with. Currently only `1` is supported | `1` |\n\n#### Response\n\n - [`sparkmonitoring.api.ClientV1`](#sparkmonitoringapiclientv1)\n\n#### Examples\n_Basic Endpoint_\n```python\nimport sparkmonitoring as sparkmon\nclient = sparkmon.client('my.history.server')\n```\n\n_Custom Endpoint_\n```python\nimport sparkmonitoring as sparkmon\nclient = sparkmon.client('my.history.server', port=8080, is_https=True)\n```\n\n### sparkmonitoring.df\n\nReturns a client for making calls to the Spark History Server. 
This\nclient will return pandas DataFrames, as opposed to dictionaries in the\nstandard client. It can be used when the `spark-monitoring[pandas]` extra is\ninstalled.\n\n#### Arguments\n\n| Name | Type | Description | Default |\n|------|------|-------------|---------|\n| `server` | `string` | Hostname or IP pointing to the spark history server | |\n| `port` | `int` | Port which the spark history server is exposed on | `18080` |\n| `is_https` | `bool` | Whether or not to use https to communicate with the spark server | `False` |\n| `api_version` | `int` | API Version to interact with. Currently only `1` is supported | `1` |\n\n#### Response\n\n - [`sparkmonitoring.dataframes.PandasClient`](#sparkmonitoringdataframespandasclient)\n\n#### Examples\n_Basic Endpoint_\n```python\nimport sparkmonitoring as sparkmon\nclient = sparkmon.df('my.history.server')\n```\n\n_Custom Endpoint_\n```python\nimport sparkmonitoring as sparkmon\nclient = sparkmon.df('my.history.server', port=8080, is_https=True)\n```\n\n### sparkmonitoring.api.ClientV1\n\nA client to interact with the Spark History Server.\nGenerally this class is not instantiated directly, and is accessed via\n[`sparkmonitoring.client(...)`](#sparkmonitoringclient).\n\n#### Arguments\n\n| Name | Type | Description | Default |\n|------|------|-------------|---------|\n| `server` | `string` | Hostname or IP pointing to the spark history server | |\n| `port` | `int` | Port which the spark history server is exposed on | |\n| `is_https` | `bool` | Whether or not to use https to communicate with the spark server | |\n| `api_version` | `int` | API Version to interact with. 
Currently only `1` is supported | |\n\n#### Methods\n\n - [`list_applications(...)`](#sparkmonitoringdataframespandasclientlist_applications)\n - `get_application(...)`\n - `list_jobs(...)`\n - `get_job(...)`\n - `list_stages(...)`\n - `list_stage_attempts(...)`\n - `get_stage_attempt(...)`\n - `get_stage_attempt_summary(...)`\n - `get_stage_attempt_tasks(...)`\n - `list_active_executors(...)`\n - `list_executor_threads(...)`\n - `list_all_executors(...)`\n\n### sparkmonitoring.dataframes.PandasClient.list_applications\n\nA list of all applications.\n\n#### Arguments\n\n| Name | Type | Description | Default |\n|------|------|-------------|---------|\n| `status` | `enum{'completed','running'}` | Type of applications to return | |\n| `minDate` | `string{ISO8601}` | Earliest Application | |\n| `maxDate` | `string{ISO8601}` | Latest Application | |\n| `limit` | `int` | Number of results to return | |\n\n### sparkmonitoring.dataframes.PandasClient\n\nA client to interact with the Spark History Server, returning pandas\nDataFrames.\nGenerally this class is not instantiated directly, and is accessed via\n[`sparkmonitoring.df(...)`](#sparkmonitoringdf).\n\n#### Arguments\n\n| Name | Type | Description | Default |\n|------|------|-------------|---------|\n| `server` | `string` | Hostname or IP pointing to the spark history server | |\n| `port` | `int` | Port which the spark history server is exposed on | `18080` |\n| `is_https` | `bool` | Whether or not to use https to communicate with the spark server | `False` |\n| `api_version` | `int` | API Version to interact with. 
Currently only `1` is supported | `1` |\n\n#### Methods\n\n - `list_applications(...)`\n - `get_application(...)`\n - `list_jobs(...)`\n - `get_job(...)`\n - `list_stages(...)`\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bliseng.github.io/spark-monitoring/", "keywords": "", "license": "LGPL3", "maintainer": "", "maintainer_email": "", "name": "spark-monitoring", "package_url": "https://pypi.org/project/spark-monitoring/", "platform": "", "project_url": "https://pypi.org/project/spark-monitoring/", "project_urls": { "Homepage": "https://bliseng.github.io/spark-monitoring/" }, "release_url": "https://pypi.org/project/spark-monitoring/0.0.3/", "requires_dist": [ "requests" ], "requires_python": "", "summary": "A python library to interact with the Spark History server", "version": "0.0.3" }, "last_serial": 4939344, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "40ded4db1b55a8a9284431f9b6ed9071", "sha256": "bde313242e13ed27d42255e2ef653e4f481b3e645f7bded2e7ec35f9cbed09f5" }, "downloads": -1, "filename": "spark_monitoring-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "40ded4db1b55a8a9284431f9b6ed9071", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4612, "upload_time": "2018-12-18T11:30:31", "url": "https://files.pythonhosted.org/packages/94/0b/ba1504f844c091304915543a0e57ef8b166e411be10f0230cfe8f8993c57/spark_monitoring-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "88e8df9576788ec7dcc3ed4c9e48a495", "sha256": "b03cd85aa48d3b3de32397ef25968c7bf8b973e704d844460bc926db4657a687" }, "downloads": -1, "filename": "spark-monitoring-0.0.1.tar.gz", "has_sig": false, "md5_digest": "88e8df9576788ec7dcc3ed4c9e48a495", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4043, "upload_time": "2018-12-18T11:30:32", "url": 
"https://files.pythonhosted.org/packages/cd/49/9814acce50d62e967f5e1e3c8d48aba5a4e9f737b0f4fa95ba7886b824d0/spark-monitoring-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "abeef79558bd6a0bc4e120c6d61b64d2", "sha256": "c30f9de00cb030f4298db9b28483139598564f81473c333cbbddbff28f9f6fef" }, "downloads": -1, "filename": "spark_monitoring-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "abeef79558bd6a0bc4e120c6d61b64d2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5606, "upload_time": "2018-12-18T11:53:55", "url": "https://files.pythonhosted.org/packages/4f/34/c43def83a12b68826f2a5af6cadbf326f68864cdd4dd2b7ce46f08cb1a61/spark_monitoring-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "853c4dd8edf56c95613fb4b535dd1488", "sha256": "6baf3009408fa93d2811f59490b315d60078e99c0099d4c64c787264e20e47ab" }, "downloads": -1, "filename": "spark-monitoring-0.0.2.tar.gz", "has_sig": false, "md5_digest": "853c4dd8edf56c95613fb4b535dd1488", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4581, "upload_time": "2018-12-18T11:53:57", "url": "https://files.pythonhosted.org/packages/4e/33/8710c68d0dccd0c7d1467736b740a0991bce0aa04d158fea448ec2bedbd6/spark-monitoring-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "23f8141660ebad20974e33b2ba4f49ab", "sha256": "fda5602c9aaba398e59f986b53d40277413509a0a93fa3c88a7809c206186df9" }, "downloads": -1, "filename": "spark_monitoring-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "23f8141660ebad20974e33b2ba4f49ab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9063, "upload_time": "2019-03-14T12:11:26", "url": "https://files.pythonhosted.org/packages/d4/8f/d9125fd0fa2964029f056f4df1f5cc4990799db3a45d5261d3a70194a68f/spark_monitoring-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "723bf30d421fc532b4232f68c9e82ff7", "sha256": 
"db49a4b3333477cace69d0297c0f9361a6cce1d9562ab80b45b469357573b771" }, "downloads": -1, "filename": "spark-monitoring-0.0.3.tar.gz", "has_sig": false, "md5_digest": "723bf30d421fc532b4232f68c9e82ff7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4923, "upload_time": "2019-03-14T12:11:27", "url": "https://files.pythonhosted.org/packages/f4/39/3805522e5bc7f4e9720db883d5b65d48d209c8dfb786a669e681ea1ab8e8/spark-monitoring-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "23f8141660ebad20974e33b2ba4f49ab", "sha256": "fda5602c9aaba398e59f986b53d40277413509a0a93fa3c88a7809c206186df9" }, "downloads": -1, "filename": "spark_monitoring-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "23f8141660ebad20974e33b2ba4f49ab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9063, "upload_time": "2019-03-14T12:11:26", "url": "https://files.pythonhosted.org/packages/d4/8f/d9125fd0fa2964029f056f4df1f5cc4990799db3a45d5261d3a70194a68f/spark_monitoring-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "723bf30d421fc532b4232f68c9e82ff7", "sha256": "db49a4b3333477cace69d0297c0f9361a6cce1d9562ab80b45b469357573b771" }, "downloads": -1, "filename": "spark-monitoring-0.0.3.tar.gz", "has_sig": false, "md5_digest": "723bf30d421fc532b4232f68c9e82ff7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4923, "upload_time": "2019-03-14T12:11:27", "url": "https://files.pythonhosted.org/packages/f4/39/3805522e5bc7f4e9720db883d5b65d48d209c8dfb786a669e681ea1ab8e8/spark-monitoring-0.0.3.tar.gz" } ] }