{ "info": { "author": "Konstantin Tretyakov", "author_email": "kt@ut.ee", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Topic :: Software Development :: Code Generators" ], "description": "SKompiler: Translate trained SKLearn models to executable code in other languages\n================================================================================\n\n[![Build Status](https://travis-ci.org/konstantint/SKompiler.svg?branch=master)](https://travis-ci.org/konstantint/SKompiler)\n\nThe package provides a tool for transforming trained SKLearn models into other forms, such as SQL queries, Excel formulas, Portable Format for Analytics (PFA) files or Sympy expressions (which, in turn, can be translated to code in a variety of languages, such as C, Javascript, Rust, Julia, etc).\n\nRequirements\n------------\n\n - Python 3.5 or later\n\nInstallation\n------------\n\nThe simplest way to install the package is via `pip`:\n\n $ pip install SKompiler[full]\n\n\nNote that the `[full]` option includes the installations of `sympy`, `sqlalchemy` and `astor`, which are necessary if you plan to convert `SKompiler`'s expressions to `sympy` expressions (which, in turn, can be compiled to many other languages) or to SQLAlchemy expressions (which can be further translated to different SQL dialects) or to Python source code. If you do not need this functionality (say, you only need the raw `SKompiler` expressions or perhaps only the SQL conversions without the `sympy` ones), you may avoid the forced installation of all optional dependencies by simply writing\n\n $ pip install SKompiler\n\n(you are free to install any of the required extra dependencies, via separate calls to `pip install`, of course)\n\nUsage\n-----\n\n### Introductory example\n\nLet us start by walking through an introductory example. We begin by training a model on a small dataset:\n\n from sklearn.datasets import load_iris\n from sklearn.ensemble import RandomForestClassifier\n X, y = load_iris(True)\n m = RandomForestClassifier(n_estimators=3, max_depth=3).fit(X, y)\n\nSuppose we need to express the logic of `m.predict` in SQLite. Here is how we can achieve that:\n\n from skompiler import skompile\n expr = skompile(m.predict)\n sql = expr.to('sqlalchemy/sqlite')\n\nVoila, the value of the `sql` variable is a query, which would compute the value of `m.predict` in pure SQL:\n\n WITH _tmp1 AS\n (SELECT .... FROM data)\n _tmp2 AS\n ( ... )\n SELECT ... from _tmp2 ...\n\nLet us import the data into an in-memory SQLite database to test the generated query:\n\n import sqlalchemy as sa\n import pandas as pd\n conn = sa.create_engine('sqlite://').connect()\n df = pd.DataFrame(X, columns=['x1', 'x2', 'x3', 'x4']).reset_index()\n df.to_sql('data', conn)\n\nOur database now contains the table named `data` with the primary key `index`. We need to provide this information to SKompiler to have it generate the correct query:\n\n sql = expr.to('sqlalchemy/sqlite', key_column='index', from_obj='data')\n\nWe can now query the data:\n\n results = pd.read_sql(sql, conn)\n \nand verify that the results match:\n\n assert (results.values.ravel() == m.predict(X).ravel()).all()\n\nNote that the generated SQL expression uses names `x1`, `x2`, `x3` and `x4` to refer to the input variables.\nWe could have chosen different input variable names by writing:\n\n expr = skompile(m.predict, ['a', 'b', 'c', 'd'])\n\n### Single-shot computation\n\nNote that the generated SQL code splits the computation into sequential steps using `with` expressions. In some cases you might want to have the whole computation \"inlined\" into a single expression. You can achieve this by specifying\n`multistage=False`:\n\n sql = expr.to('sqlalchemy/sqlite', multistage=False)\n\nNote that in this case the resulting expression would typically be several times longer than the multistage version:\n\n len(expr.to('sqlalchemy/sqlite'))\n > 2262\n len(expr.to('sqlalchemy/sqlite', multistage=False))\n > 12973\n\nWhy so? Because, for a typical classifier (including the one used in this example)\n\n predict(x) = argmax(predict_proba(x))\n\nThere is, however, no single `argmax` function in SQL, hence it has to be faked using the following logic:\n\n predict(x) = if predict_proba(x)[0] == max(predict_proba(x)) then 0\n else if predict_proba(x)[1] == max(predict_proba(x)) then 1\n else 2\n\nIf SKompiler is not alowed to use a separate step to store the intermediate `predict_proba` outputs, it is forced to inline the same computation verbatim multiple times. To summarize, you should probably avoid the use of `multistage=False` in most cases.\n\n### Other formats\n\nBy changing the first parameter of the `.to()` call you may produce output in a variety of other formats besides SQLite:\n\n * `sqlalchemy`: raw SQLAlchemy expression (which is a dialect-independent way of representing SQL). Jokes aside, SQL is sometimes a totally valid choice for deploying models into production.\n \n Note that generated SQL may (depending on the chosen model and method) include functions `exp`, `log` and `sqrt`, which are not supported out of the box in SQLite. If you work with SQLite, you will need to [add them separately](https://stackoverflow.com/a/2108921/318964) via `create_function`. You can find an example of how this can be done in `tests/evaluators.py` in the SKompiler's source code.\n * `sqlalchemy/`: SQL string in any of the SQLAlchemy-supported dialects (`firebird`, `mssql`, `mysql`, `oracle`, `postgresql`, `sqlite`, `sybase`). This is a convenience feature for those who are lazy to figure out how to compile raw SQLAlchemy to actual SQL.\n * `excel`: Excel formula. Ever tried dragging a random forest equation down along the table? Fun! Check out [this short screencast](https://www.youtube.com/watch?v=7vUfa7W0NpY) to see how it can be done.\n \n _NB: The screencast was recorded using a previous version, where `multistage=False` was the default option_.\n * `pfa`: A dict with [PFA](http://dmg.org/pfa/) code.\n * `pfa/json` or `pfa/yaml`: PFA code as a JSON or YAML string for those who are lazy to write `json.dumps` or `yaml.dump`. PyYAML should be installed in the latter case, of course.\n * `sympy`: A SymPy expression. Ever wanted to take a derivative of your model symbolically?\n * `sympy/`: Code in the language ``, generated via SymPy. Supported values for `` are `c`, `cxx`, `rust`, `fortran`, `js`, `r`, `julia`, `mathematica`, `octave`. Note that the quality of the generated code varies depending on the model, language and the value of the `assign_to` parameter. Again, this is just a convenience feature, you will get more control by dealing with `sympy` code printers [manually](https://www.sympy.org/scipy-2017-codegen-tutorial/). \n \n _NB: Sympy translation does not support multistage mode at the moment, hence the resulting code will have repeated subexpressions (which can be extracted by means of Sympy itself, however)._\n\n * `python`: Python syntax tree (the same you'd get via `ast.parse`). This (and the following three options) are mostly useful for debugging and testing.\n * `python/code`: Python source code. The generated code will contain references to custom functions, such as `__argmax__`, `__sigmoid__`, etc. To execute the code you will need to provide these in the `locals` dictionary. See `skompiler.fromskast.python._eval_vars`.\n * `python/lambda`: Python callable function (primarily useful for debugging and testing). Equivalent to calling `expr.lambdify()`.\n * `string`: string, equivalent to `str(expr)`.\n\n### Other models\n\nSo far this has been a fun two-weekends-long project, hence translation is implemented for a limited number of models. The most basic ones (linear models, decision trees, forests, gradient boosting, PCA, KMeans, MLP, Pipeline and a couple of preprocessors) are covered, however, and this is already sufficient to compile nontrivial constructions. For example:\n\n m = Pipeline([('scale', StandardScaler()),\n ('dim_reduce', PCA(6)),\n ('cluster', KMeans(10)),\n ('classify', MLPClassifier([5, 4], 'tanh'))])\n\nEven though this particular example probably does not make much sense from a machine learning perspective, it would happily compile both to Excel and SQL forms none the less.\n\nRudimentary support for theano-backed Keras models (covering the basic `Dense` layer for now) is also available. The following would work, for example:\n\n import os\n os.environ['KERAS_BACKEND'] = 'theano'\n \n from keras.models import Sequential\n from keras.layers import Dropout, Dense\n m = Sequential([Dense(3, activation='tanh', input_shape=(4,)),\n Dense(3, activation='linear'),\n Dropout(0.5),\n Dense(3, activation='sigmoid')])\n\n from skompiler import skompile\n skompile(m.predict)\n\n### How it works\n\nThe `skompile` procedure translates a given method into an intermediate syntactic representation (called SKompiler AST or SKAST). This representation uses a limited number of operations so it is reasonably simple to translate it into other forms.\n\nIn principle, SKAST's utility is not limited to `sklearn` models. Anything you translate into SKAST becomes automatically compileable to whatever output backends are implemented in `SKompiler`. Generating raw SKAST is quite straightforward:\n\n from skompiler.dsl import ident, const\n expr = const([[1,2],[3,4]]) @ ident('x', 2) + 12\n expr.to('sqlalchemy/sqlite', 'result')\n > SELECT 1 * x1 + 2 * x2 + 12 AS result1, 3 * x1 + 4 * x2 + 12 AS result2 \n > FROM data\n\nYou can use `repr(expr)` on any SKAST expression to dump its unformatted internal representation for examination or `str(expr)` to get a somewhat-formatted view of it.\n\nIt is important to note, that for larger models (say, a random forest or a gradient boosted model with 500+ trees) the resulting SKAST expression tree may become deeper than Python's default recursion limit of 1000. As a result some translators may produce a `RecursionError` when processing such expressions. This can be solved by raising the system recursion limit to sufficiently high value:\n\n import sys\n sys.setrecursionlimit(10000)\n\nDevelopment\n-----------\n\nIf you plan to develop or debug the package, consider installing it by running:\n\n $ pip install -e .[dev]\n\nfrom within the source distribution. This will install the package in \"development mode\" and include extra dependencies, useful for development.\n\nYou can then run the tests by typing\n\n $ py.test\n \nat the root of the source distribution.\n\nContributing\n------------\n\nFeel free to contribute or report issues via Github:\n\n * https://github.com/konstantint/SKompiler\n\n\nCopyright & License\n-------------------\n\nCopyright: 2018, Konstantin Tretyakov.\nLicense: MIT", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/konstantint/SKompiler", "keywords": "sklearn datascience modelling deployment", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "SKompiler", "package_url": "https://pypi.org/project/SKompiler/", "platform": "", "project_url": "https://pypi.org/project/SKompiler/", "project_urls": { "Homepage": "https://github.com/konstantint/SKompiler" }, "release_url": "https://pypi.org/project/SKompiler/0.5.5/", "requires_dist": null, "requires_python": "", "summary": "Library for compiling trained SKLearn models into abstract expressions suitable for further compilation into executable code in various languages.", "version": "0.5.5" }, "last_serial": 4742413, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "bf681a86b3c4fd6d438e695b4605567b", "sha256": "7932c1360f1090a4b92fc4d579c40e154da18f1d359f3289d3164a12656a558c" }, "downloads": -1, "filename": "SKompiler-0.1.tar.gz", "has_sig": false, "md5_digest": "bf681a86b3c4fd6d438e695b4605567b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27510, "upload_time": "2018-12-12T03:45:43", "url": "https://files.pythonhosted.org/packages/4e/85/24c0c63e10e26315cb40b8216f7066749c005b838790e50bef0cef2c9ab9/SKompiler-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "3fdf3ce334e94a832efa6d6c5c631efe", "sha256": "7ed07af1d32d136cf07d1b23d96dab0dee5a0b88964eea69a7353dc1f8b1ef64" }, "downloads": -1, "filename": "SKompiler-0.2.tar.gz", "has_sig": false, "md5_digest": "3fdf3ce334e94a832efa6d6c5c631efe", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 33310, "upload_time": "2018-12-13T05:26:46", "url": "https://files.pythonhosted.org/packages/30/a5/deb8da65dfadcee99b883b5f6d5dde5904b5a31cac1a7895e11bfa1e2cc8/SKompiler-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "c1f244586962b7596363d61455994ced", "sha256": "c04d586b165332a5b86513c429056b26e379d0409db3a8eceebc854a95fd7f66" }, "downloads": -1, "filename": "SKompiler-0.3.tar.gz", "has_sig": false, "md5_digest": "c1f244586962b7596363d61455994ced", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37121, "upload_time": "2018-12-13T18:32:50", "url": "https://files.pythonhosted.org/packages/a9/3b/97785ab1727f34f7b7a7ac65d3f651040865160578dda9143e19f3cc9e13/SKompiler-0.3.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "c7c6b42bb9ebeb3bde0f530045db974d", "sha256": "95572d41d329eccafec9585d8593208f531ea116548d22d4455721c2c3c292e1" }, "downloads": -1, "filename": "SKompiler-0.3.1.tar.gz", "has_sig": false, "md5_digest": "c7c6b42bb9ebeb3bde0f530045db974d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38028, "upload_time": "2018-12-13T20:53:14", "url": "https://files.pythonhosted.org/packages/f0/fd/a5e077134235e8c0a486fc7b91e550a52c18494c34e4281ed6104bb42fd5/SKompiler-0.3.1.tar.gz" } ], "0.3.1.post1": [ { "comment_text": "", "digests": { "md5": "daa7b35afa9beae1224a010d107ac2d9", "sha256": "d1f5b23c3b0e67db8fd7be381abf59cd6f6953b1cb61b166b4fb2394592b80f4" }, "downloads": -1, "filename": "SKompiler-0.3.1.post1.tar.gz", "has_sig": false, "md5_digest": "daa7b35afa9beae1224a010d107ac2d9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 36310, "upload_time": "2018-12-14T10:37:30", "url": "https://files.pythonhosted.org/packages/01/1a/9a178227da019df6cfe122731a45c439c2077ab2846436184622639854f7/SKompiler-0.3.1.post1.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "ab4bbcf0022080543045836f3b1464dd", "sha256": "1e1db2fdada8233f4431bf6b355031fca73fb81ba496eea8e147f0fc6addab1d" }, "downloads": -1, "filename": "SKompiler-0.4.tar.gz", "has_sig": false, "md5_digest": "ab4bbcf0022080543045836f3b1464dd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38673, "upload_time": "2018-12-15T00:34:56", "url": "https://files.pythonhosted.org/packages/bb/69/733b23993a5bd174203cbf6a6cd08a9e25ba6ad60034cf38e7d1a0f69dad/SKompiler-0.4.tar.gz" } ], "0.5": [ { "comment_text": "", "digests": { "md5": "ca67a8e0f8302f2028a40bfcef0ccfdc", "sha256": "eba06430342c15d8a8959e4f83143757f256736c25b35f7fa6d41c18189e6f55" }, "downloads": -1, "filename": "SKompiler-0.5.tar.gz", "has_sig": false, "md5_digest": "ca67a8e0f8302f2028a40bfcef0ccfdc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 43104, "upload_time": "2018-12-16T22:59:00", "url": "https://files.pythonhosted.org/packages/00/74/90d0ef07c796ec0352ea7c8952453c381a263d73b6f484b5571503310123/SKompiler-0.5.tar.gz" } ], "0.5.1": [ { "comment_text": "", "digests": { "md5": "c1ffcb85a5712b760f753e2d472275ef", "sha256": "aa4a08ee5cb5f730c3dcba563a652999f5f627d96e731f2f5e9c112bb2c1cf20" }, "downloads": -1, "filename": "SKompiler-0.5.1.tar.gz", "has_sig": false, "md5_digest": "c1ffcb85a5712b760f753e2d472275ef", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45767, "upload_time": "2018-12-17T17:53:40", "url": "https://files.pythonhosted.org/packages/35/37/c0e7907c5ecdfcdfcf2fbbc3659d3dafbe3a475455036a14b72f86f92394/SKompiler-0.5.1.tar.gz" } ], "0.5.2": [ { "comment_text": "", "digests": { "md5": "6af882cebc6832f84a70d413195642e9", "sha256": "9ba149bba7c1b32cc5294a678c3060747b12485801892f0ce1c196dbade20d69" }, "downloads": -1, "filename": "SKompiler-0.5.2.tar.gz", "has_sig": false, "md5_digest": "6af882cebc6832f84a70d413195642e9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45900, "upload_time": "2018-12-21T01:33:55", "url": "https://files.pythonhosted.org/packages/34/48/0dbccdd8577ca5406082dc91eeea2a71d074b6183e5a13a4d60fe2e00ff1/SKompiler-0.5.2.tar.gz" } ], "0.5.3": [ { "comment_text": "", "digests": { "md5": "03375fe65a4ea9c7f5cf92ae10bc5836", "sha256": "6ec9684b55990095cc65d45723bba9bddd2fdb8fa901d14128ce7964e5450d10" }, "downloads": -1, "filename": "SKompiler-0.5.3.tar.gz", "has_sig": false, "md5_digest": "03375fe65a4ea9c7f5cf92ae10bc5836", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 46034, "upload_time": "2018-12-21T18:44:55", "url": "https://files.pythonhosted.org/packages/59/cf/17a809b1aee05979a29e158fec9070f0b6ba27dbea70c788a3a688094c54/SKompiler-0.5.3.tar.gz" } ], "0.5.4": [ { "comment_text": "", "digests": { "md5": "86f038bbac69898d7d20ab2d891ca4cd", "sha256": "3ac4cc13ef97c045d4b608dd9cde84fc939dbf8c88a073c93288fb24ddc7043e" }, "downloads": -1, "filename": "SKompiler-0.5.4.tar.gz", "has_sig": false, "md5_digest": "86f038bbac69898d7d20ab2d891ca4cd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 48367, "upload_time": "2019-01-06T03:19:49", "url": "https://files.pythonhosted.org/packages/b9/64/758271aa11858b7fe3b7b55f59fa9f41a91f01f009b60c9d996ef9716a45/SKompiler-0.5.4.tar.gz" } ], "0.5.5": [ { "comment_text": "", "digests": { "md5": "89a6929ad2918cbe059843a1fd3ab4ce", "sha256": "9f3125fd2d52b4fb2cfe075c19cde205510d4d66c473ce2913d9a1a51d8e684a" }, "downloads": -1, "filename": "SKompiler-0.5.5.tar.gz", "has_sig": false, "md5_digest": "89a6929ad2918cbe059843a1fd3ab4ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 48427, "upload_time": "2019-01-26T00:39:09", "url": "https://files.pythonhosted.org/packages/77/31/cbc7fc4fe064425efbf2044b94bd1973472d8aac85b4d9346c7330e03dfb/SKompiler-0.5.5.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "89a6929ad2918cbe059843a1fd3ab4ce", "sha256": "9f3125fd2d52b4fb2cfe075c19cde205510d4d66c473ce2913d9a1a51d8e684a" }, "downloads": -1, "filename": "SKompiler-0.5.5.tar.gz", "has_sig": false, "md5_digest": "89a6929ad2918cbe059843a1fd3ab4ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 48427, "upload_time": "2019-01-26T00:39:09", "url": "https://files.pythonhosted.org/packages/77/31/cbc7fc4fe064425efbf2044b94bd1973472d8aac85b4d9346c7330e03dfb/SKompiler-0.5.5.tar.gz" } ] }