{ "info": { "author": "Dylan Albrecht", "author_email": "djalbrecht@email.wm.edu", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering", "Topic :: Software Development" ], "description": "![fast](images/leapy.gif)\n\n**Just give me the dataz!**\n\nWelcome! Leapy is a library for real-time, sub-millisecond inference;\nit provides customizable machine learning pipeline export for fast\nmodel serving. \nThese pipelines are targeted for using Dask's scalable machine learning,\nwhich follows the Scikit-Learn API. However, you can use this framework\ndirectly with Scikit-Learn as well.\n\n```python\npipe = Pipeline([\n ('fp', FeaturePipeline([('ohe',\n OneHotEncoder(sparse=False),\n [0, 1])])),\n ('clf', LogisticRegression())\n])\n\npipe.fit(X, y)\n\npipeline_runtime = pipe.to_runtime() # \u26a1\u26a1\u26a1 \ninit('./model_repo', pipeline_runtime, df.head()) # Ready to deploy\n```\n\nAnd serve this super fast pipeline:\n```\n$ leap serve --repo ./model_repo\n$ curl localhost:8080:/health\n{\n \"status\": \"healthy\"\n}\n```\n(See below for benchmarks and a more detailed usage example.)\n\n### Benefits\n\nDask is a Python distributed computing environment in Python with a\nburgeoning machine learning component, compatible with Scikit-Learn's API.\nUsing Leapy's framework, we can serve these pipelines in real-time!\n\nThis means:\n\n* No JVM: No reliance on JVM from using Spark.\n* Python: All Python development and custom transformers -- no Scala & Java\n needed!\n* Scale: Scikit-Learn logic and pipelines scaled to clusters.\n* Fast: You're in control of how fast your transformers are.\n* Simple: Easily build and deploy models with Docker.\n* Reliable: Encourages a test-driven approach.\n\n\n### Examples\n\n* [Simple](examples/simple)\n -- Super simple example of creating, testing, and using custom\n transformers\n* [XGBoost](examples/)\n -- Advanced example of using XGBoost with Dask, saving, and serving the\n model.\n\n### Benchmarks\n\nA simple example of what we're going for -- computing a one-hot-encoding,\nwith ~200K labels, of a single data point (dask array `x_da` and numpy array\n`x = x_da.compute()`):\n\n![sample benchmark](images/sample_benchmark.png)\n\nWhere `ohe_dml` (from `dask_ml`) and `ohe` (from `leapy`) are essentially the\nsame; `ohe_sk` is from `scikit-learn` and `ohe_runtime` is from\n`ohe.to_runtime()`. And, running `compute()` on Dask transforms above bumps\nthe time up to about 1 second.\n\nWith the time we save using `ohe_runtime`, we can squeeze in many more\ntransformations and an estimator to still come in under 1ms.\n\n### Example Usage\n\nStart with a dataset in dask arrays, `X`, `y`, and dataframe `ddf`:\npipeline:\n```python\nimport numpy as np\nimport pandas as pd\nimport dask.array as da\nimport dask.dataframe as dd\n\nX_np = np.array([[1, 'a'], [2, 'b']], dtype=np.object)\ndf_pd = pd.DataFrame(X_np, columns=['test_int', 'test_str'])\ny_np = np.array([0, 1])\n\nX = da.from_array(X_np, chunks=X_np.shape)\ny = da.from_array(y_np, chunks=y_np.shape)\nddf = dd.from_pandas(df_pd, npartitions=1)\n```\n\nCreate our pipeline:\n\n```python\nfrom sklearn.pipeline import Pipeline\nfrom dask_ml.linear_model import LogisticRegression\nfrom leapy.dask.transformers import OneHotEncoder\nfrom leapy.dask.pipeline import FeaturePipeline\nfrom leapy.serve import init\n\npipe = Pipeline([\n ('fp', FeaturePipeline([\n # One-Hot-Encode 'test_str' feature, drop 'test_int'\n ('ohe', OneHotEncoder(sparse=False), [1])])),\n ('clf', LogisticRegression())\n])\n\npipe.fit(X, y)\n```\n\nThen we export to a runtime pipeline and get ready for model serving:\n\n```python\npipe_runtime = pipe.to_runtime()\ninit('./model_repo', pipe_runtime, ddf.head())\n```\n\nFinally we serve the model:\n```\n$ leap serve --repo ./model_repo\n$ curl localhost:8080/predict \\\n -X POST \\\n -H \"Content-Type: application/json\" \\\n --data '{\"test_int\": 1, \"test_str\": \"b\"}'\n{\n \"prediction\": 1.0\n}\n```\n\n\nFor more on model serving see [leapy/serve/README.md](leapy/serve/README.md).\n\n## Acknowledgments\n\nLeapy is inspired by [MLeap](https://github.com/combust/mleap).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/nonabelian/leapy", "keywords": "", "license": "new BSD", "maintainer": "", "maintainer_email": "", "name": "leapy", "package_url": "https://pypi.org/project/leapy/", "platform": "", "project_url": "https://pypi.org/project/leapy/", "project_urls": { "Homepage": "http://github.com/nonabelian/leapy" }, "release_url": "https://pypi.org/project/leapy/0.1.0/", "requires_dist": [ "pandas (>=0.24.2)", "dask (>=1.1.4)", "dask-ml (>=0.12.0)", "numpy (>=1.16.2)", "scikit-learn (>=0.20.3)", "marshmallow (>=2.19.2)", "docker (==3.7.2)" ], "requires_python": "", "summary": "Real-time inference pipelines", "version": "0.1.0" }, "last_serial": 5139817, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "7054f9d2c6196fc2a5d70a255b1afe94", "sha256": "bbcc02ef254e8a5efc7339bbeb483c44e506b8fe6d1dec1353405c5abf9c4108" }, "downloads": -1, "filename": "leapy-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "7054f9d2c6196fc2a5d70a255b1afe94", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19421, "upload_time": "2019-04-10T01:05:32", "url": "https://files.pythonhosted.org/packages/9e/d8/91a8c5048a5391e00cfa07952a04f62dfd91ab07189947b64a4814b61813/leapy-0.0.2-py3-none-any.whl" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "74077f375afd35cdd335f9d9c513848a", "sha256": "a4a64cec1458298dfd49748ed85e394ae524f808aea79c37da746511d4bcb373" }, "downloads": -1, "filename": "leapy-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "74077f375afd35cdd335f9d9c513848a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20207, "upload_time": "2019-04-14T01:55:49", "url": "https://files.pythonhosted.org/packages/52/88/198f3fe2a29c83d7276920fde78aba6aa5471a324e12037bccda8a328fb9/leapy-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "64288db85895bf62b1d1fe611e833617", "sha256": "0f023f28ba4e546b2e9066a1c2d3ba68fe119b3f2b2854ff41fc44161bc79ac9" }, "downloads": -1, "filename": "leapy-0.0.3.tar.gz", "has_sig": false, "md5_digest": "64288db85895bf62b1d1fe611e833617", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13133, "upload_time": "2019-04-14T01:55:51", "url": "https://files.pythonhosted.org/packages/0f/8c/86407433706e84e22eaeb71ec9a60787f75b7ddc7da8f9d023c626126727/leapy-0.0.3.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "3cf01cae541ffe754699127296fecb3a", "sha256": "c92a7d0c6ae90fa341b2a358e014bdc0d4850da8b4214757e63f1675c8ce4ae5" }, "downloads": -1, "filename": "leapy-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "3cf01cae541ffe754699127296fecb3a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 22964, "upload_time": "2019-04-14T04:07:46", "url": "https://files.pythonhosted.org/packages/f5/16/cccd9d3210e1c1f7123ad11f59a2ebab800c4c484c9b657ffd80a3c0ac6b/leapy-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "797476fad7858de66a6766560795c6af", "sha256": "d7cc0102c8902b3c6364c66356d1aa3fee1387033ced01143b6e90901ae92a94" }, "downloads": -1, "filename": "leapy-0.1.0.tar.gz", "has_sig": false, "md5_digest": "797476fad7858de66a6766560795c6af", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15836, "upload_time": "2019-04-14T04:07:48", "url": "https://files.pythonhosted.org/packages/dd/23/28e3c4af3b633f5b4f4241818babf8d3f6820b88e4e7c4ddd6640898912b/leapy-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3cf01cae541ffe754699127296fecb3a", "sha256": "c92a7d0c6ae90fa341b2a358e014bdc0d4850da8b4214757e63f1675c8ce4ae5" }, "downloads": -1, "filename": "leapy-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "3cf01cae541ffe754699127296fecb3a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 22964, "upload_time": "2019-04-14T04:07:46", "url": "https://files.pythonhosted.org/packages/f5/16/cccd9d3210e1c1f7123ad11f59a2ebab800c4c484c9b657ffd80a3c0ac6b/leapy-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "797476fad7858de66a6766560795c6af", "sha256": "d7cc0102c8902b3c6364c66356d1aa3fee1387033ced01143b6e90901ae92a94" }, "downloads": -1, "filename": "leapy-0.1.0.tar.gz", "has_sig": false, "md5_digest": "797476fad7858de66a6766560795c6af", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15836, "upload_time": "2019-04-14T04:07:48", "url": "https://files.pythonhosted.org/packages/dd/23/28e3c4af3b633f5b4f4241818babf8d3f6820b88e4e7c4ddd6640898912b/leapy-0.1.0.tar.gz" } ] }