{ "info": { "author": "Matthew Rocklin", "author_email": "mrocklin@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "Dask-Spark\n==========\n\nLaunch Dask from Spark and Spark from Dask. This project is not mature.\n\n\nExamples\n--------\n\n::\n\n pip install dask-spark\n\nCreate Spark cluster from a Dask cluster\n\n.. code-block:: python\n\n >>> from dask.distributed import Client\n >>> client = Client('scheduler-address:8786')\n >>> client\n \n\n >>> from dask_spark import dask_to_spark\n >>> sc = dask_to_spark(client)\n >>> sc\n \n\nCreate Dask cluster from a Spark cluster\n\n.. code-block:: python\n\n >>> import pyspark\n >>> sc = pyspark.SparkContext('local[4]')\n \n\n >>> from dask_spark import spark_to_dask\n >>> client = spark_to_dask(sc)\n >>> client\n \n\n\nRequirements and How this Works\n-------------------------------\n\nThis depends on a relatively recent version of Dask.distributed.\n\nFor starting Spark from Dask this assumes that you have Spark installed and\nthat the ``start-master.sh`` and ``start-slave.sh`` Spark scripts are available\non the PATH of the workers. This starts a long-running Spark master process on\nthe Dask Scheduler and starts long-running Spark slaves on Dask workers. There\nwill only be one slave per worker. We set the number of cores and the amount\nof memory to match the Dask workers and available memory.\n\nWhen starting Dask from Spark this will block the Spark cluster. We start a\nscheduler on the local machine and then run a long-running function that starts\nup a Dask worker using ``RDD.mapPartitions``.\n\n\nTODO\n----\n\n- [ ] This almost certainly fails in non-trivial situations\n- [ ] Enable user specification of Java flags for memory and core use\n- [ ] Support multiple spark clusters per Dask cluster\n\n\n",
"description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "dask-spark", "package_url": "https://pypi.org/project/dask-spark/", "platform": "", "project_url": "https://pypi.org/project/dask-spark/", "project_urls": null, "release_url": "https://pypi.org/project/dask-spark/0.0.2/", "requires_dist": [ "dask (>=0.14.0)", "distributed (>=1.16.0)" ], "requires_python": "", "summary": "Interactions between Dask and Spark", "version": "0.0.2" }, "last_serial": 2672290, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "00e95e016c443e4b7bc1c13ab1ccb7d0", "sha256": "76d9f344356742994da1884cbd5509c1e70e1cae4e7bdd2566eebd803233e739" }, "downloads": -1, "filename": "dask_spark-0.0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "00e95e016c443e4b7bc1c13ab1ccb7d0", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 5194, "upload_time": "2017-02-26T21:57:31", "url": "https://files.pythonhosted.org/packages/53/dc/0e234f60469840d63f5748298c0495c0c2c3558b845c5fa3c414b6eb66fd/dask_spark-0.0.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7140d36017a628be3f91ffebb77fbc27", "sha256": "73235fbca522b3d933fdc257d52663c0798ccb7036e216357fba92a37541555a" }, "downloads": -1, "filename": "dask-spark-0.0.1.tar.gz", "has_sig": false, "md5_digest": "7140d36017a628be3f91ffebb77fbc27", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3540,
"upload_time": "2017-02-26T21:57:32", "url": "https://files.pythonhosted.org/packages/d3/4b/a58624623d9bc86f66e0be31c81324a4ab62dde80832269bb1f5a32837ad/dask-spark-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "1268ec75445356624e5015f3ff723437", "sha256": "119c00115f21d793671c385d9ca44e480130374cb40140132cef82517ba3def3" }, "downloads": -1, "filename": "dask_spark-0.0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "1268ec75445356624e5015f3ff723437", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 5297, "upload_time": "2017-02-28T03:45:00", "url": "https://files.pythonhosted.org/packages/71/aa/8473e64e11fd8c11374b2d0c7924b678b16a412a0ca858e69cbe85a3450d/dask_spark-0.0.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0211073cd5e64b9d6445905ffcecec3d", "sha256": "cac382297f17cc48308baea9f3277cae46c3a5d3a8ee1be4f68720a10e438a0c" }, "downloads": -1, "filename": "dask-spark-0.0.2.tar.gz", "has_sig": false, "md5_digest": "0211073cd5e64b9d6445905ffcecec3d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3623, "upload_time": "2017-02-28T03:45:01", "url": "https://files.pythonhosted.org/packages/a8/0d/dd3429bba41a7ca00c81b8f9e97baf43439c8d9b4ba4f072d34888905b20/dask-spark-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1268ec75445356624e5015f3ff723437", "sha256": "119c00115f21d793671c385d9ca44e480130374cb40140132cef82517ba3def3" }, "downloads": -1, "filename": "dask_spark-0.0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "1268ec75445356624e5015f3ff723437", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 5297, "upload_time": "2017-02-28T03:45:00", "url": "https://files.pythonhosted.org/packages/71/aa/8473e64e11fd8c11374b2d0c7924b678b16a412a0ca858e69cbe85a3450d/dask_spark-0.0.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"0211073cd5e64b9d6445905ffcecec3d", "sha256": "cac382297f17cc48308baea9f3277cae46c3a5d3a8ee1be4f68720a10e438a0c" }, "downloads": -1, "filename": "dask-spark-0.0.2.tar.gz", "has_sig": false, "md5_digest": "0211073cd5e64b9d6445905ffcecec3d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3623, "upload_time": "2017-02-28T03:45:01", "url": "https://files.pythonhosted.org/packages/a8/0d/dd3429bba41a7ca00c81b8f9e97baf43439c8d9b4ba4f072d34888905b20/dask-spark-0.0.2.tar.gz" } ] }