{ "info": { "author": "sarthfrey", "author_email": "sarth.frey@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 1 - Planning", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Software Development :: Version Control :: Git", "Typing :: Typed" ], "description": "# ensemble\n\nCombine models, *easily*.\n\n\"Model\n\n**ensemble** lets you combine your models and access them by a single object. You may use that ensemble to multiplex between your models, call them all, and aggregate the results. You can do *bagging*, *boosting*, *stacking*, and more. You may even create ensembles of ensembles!\n\nThis package borrows the idea of computation graph sessioning from [TensorFlow](https://github.com/tensorflow/tensorflow) and implements the [composite pattern](https://en.wikipedia.org/wiki/Composite_pattern) for building tree hierarchies.\n\n### Documentation\n\n[![Documentation Status](https://readthedocs.org/projects/ensemble-pkg/badge/?version=latest)](https://ensemble-pkg.readthedocs.io/en/latest/?badge=latest)\n\nRead the docs at [ensemble-pkg.readthedocs.io](https://ensemble-pkg.readthedocs.io)\n\n### Installation\n\n```\npip install ensemble-pkg\n```\n\n### Case Study\n\nLet's say we have two models that accomplish a binary classification task. We want to be able to easily combine the two models into an ensemble and then test the precision and recall of both models and the ensemble. With this package, it's easy.\n\nWe start off by building our two models for a dataset. In this case, the task is classifying if a number is divisble by 15. We have two models, one says the number is divisble by 15 if it is divisble by 3 and the other says so if it is divisble by 5.\n\n```python\ndef model1(x):\n return x % 3 == 0\n\ndef model2(x):\n return x % 5 == 0\n\ndef get_dataset():\n return [(i, i % 15 == 0) for i in range(1, 101)]\n```\n\nDefine a function that gets our precision and recall for a given dataset and set of predictions:\n\n```python\ndef get_results(dataset, preds):\n labels = [label for _, label in dataset]\n positives = sum(1 for label in labels if label)\n predicted_positives = sum(1 for pred in preds if pred)\n true_positives = sum(1 for label, pred in zip(labels, preds) if label and pred)\n return 100.0 * true_positives / predicted_positives, 100.0 * true_positives / positives\n```\n\nNext we build a model ensemble from `model1` and `model2`, specifically one that only outputs `True` if all its children output `True`. In this case, the ensemble would then only output `True` if the input is both divisible by 3 and 5.\n\n```python\ne = Ensemble('ensemble', children=[model1, model2], mode='all')\ne(x=3) # returns False\ne(x=5) # returns False\ne(x=15) # returns True\n```\n\nFinally, lets build another ensemble from our two models and the ensemble in order to easily aggregate the precision and recall stats for each model. We do this by decorating each child model with an evaluation decorator which modifies the models to take a dataset and output precision and recall, instead of taking a number and outputing `True` or `False`.\n\n```python\ndef evaluate(model):\n def wrapper(dataset):\n preds = [model(x=x) for x, _ in dataset]\n precision, recall = get_results(dataset, preds)\n return {\n 'precision': f'{precision:.1f}%',\n 'recall': f'{recall:.1f}%',\n }\n return wrapper\n\nresults = Ensemble('results', children=[model1, model2, e])\nresults.decorate_children(evaluate)\n```\n\nFinally we run `results(dataset=get_dataset())` and get the following results as expected!\n\n```\n{'ensemble': {'precision': '100.0%', 'recall': '100.0%'},\n 'model1': {'precision': '18.2%', 'recall': '100.0%'},\n 'model2': {'precision': '30.0%', 'recall': '100.0%'}}\n```\n\nIn a few keystrokes, we built the following graph structure.\n\n\"Graph\n\n### Examples\n\nDefine your model functions and create your ensemble:\n\n```python\n>>> from ensemble import Ensemble\n>>> def square(x):\n... return x**2\n>>> def cube(x):\n... return x**3\n>>> e = Ensemble(name='e1', children=[square, cube])\n```\n\nCall all the models in the ensemble:\n```python\n>>> e(x=2)\n{'square': 4, 'cube': 8}\n>>> e(x=3)\n{'square': 9, 'cube': 27}\n```\n\nMultiplex between functions:\n\n```python\n>>> e.multiplex('square', x=2)\n4\n>>> e.multiplex('cube', x=3)\n27\n```\n\nYou may instead decorate your model functions with `@child` in order to attach them to an ensemble:\n\n```python\n>>> from ensemble import child\n>>> @child('e2')\n... def func1(x):\n... return x**2\n...\n>>> @child('e2')\n... def func2(x):\n... return x**3\n...\n>>> e = Ensemble('e2')\n>>> e(x=3)\n{'func1': 9, 'func2': 27}\n```\n\nYou may even attach a model to multiple ensembles!\n\n```python\n>>> @child('e2', 'e3')\n... def func3(x, y):\n... return x**3 + y\n...\n>>> e2(x=2, y=3)\n{'func1': 4, 'func2': 8, 'func3': 11}\n>>>\n>>> e3 = Ensemble('e3')\n>>> e3(x=2, y=3)\n{'func3': 11}\n```\n\nCompute statstical aggregations from your ensemble's models:\n\n```python\n>>> def a(x):\n... return x + 1\n...\n>>> def b(y):\n... return y + 2\n...\n>>> def c(z):\n... return z + 2\n...\n>>> e = Ensemble('e4', children=[a, b], weights=[3.0, 1.0])\n>>> e.mean(x=2, y=3)\n4.0\n>>> e.weighted_mean(x=2, y=3)\n3.5\n>>> e.weighted_sum(x=2, y=3)\n14.0\n>>> e = Ensemble('e6', [a, b, c])\n>>> e.vote(x=1, y=1, z=1)\n3\n```\n\nBuild ensembles of ensembles!\n\n```python\n>>> first_ensemble = Ensemble('first', children=[c])\n>>> second_ensemble = Ensemble('second', children=[a, b])\n>>> parent_ensemble = Ensemble('parent', children=[first_ensemble, second_ensemble])\n>>> parent_ensemble(x=1, y=1, z=1)\n{'first': {'c': 3}, 'second': {'a': 2, 'b': 3}}\n>>> parent_ensemble.multiplex('second', x=3, y=1)\n{'a': 4, 'b': 3}\n```\n\nUse that idea to chain aggregate computations! Compute the mean of the sum of the model outputs in each ensemble:\n\n```python\n>>> first_ensemble.set_mode('sum')\nEnsemble(name='first', children=['c'], weights=None, mode='sum')\n>>> second_ensemble.set_mode('sum')\nEnsemble(name='second', children=['a', 'b'], weights=None, mode='sum')\n>>> parent_ensemble.mean(x=1, y=1, z=1)\n4.0\n```\n\nIf you forget what models are in your ensemble, just check:\n\n```python\n>>> print(parent_ensemble)\nEnsemble(name='parent', children=['first', 'second'], weights=None, mode='all')\n Ensemble(name='first', children=['c'], weights=None, mode='sum')\n Model(name='c', func=c(z))\n Ensemble(name='second', children=['a', 'b'], weights=None, mode='sum')\n Model(name='a', func=a(x))\n Model(name='b', func=b(y))\n```\n\nIn the above example, a tree is shown which shows which models and ensembles are the children of which ensembles!\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/sarthfrey/ensemble", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "ensemble-pkg", "package_url": "https://pypi.org/project/ensemble-pkg/", "platform": "", "project_url": "https://pypi.org/project/ensemble-pkg/", "project_urls": { "Homepage": "https://github.com/sarthfrey/ensemble" }, "release_url": "https://pypi.org/project/ensemble-pkg/0.0.3/", "requires_dist": [ "numpy (>=1.0.0)" ], "requires_python": "", "summary": "Combine models, easily.", "version": "0.0.3" }, "last_serial": 5630048, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "196ae8b45d15c0ede700db40b0053827", "sha256": "af42de81e4260284894bf1c6aff3318b9e6549945888a8111f681b512a906492" }, "downloads": -1, "filename": "ensemble_pkg-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "196ae8b45d15c0ede700db40b0053827", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6257, "upload_time": "2019-07-14T00:50:34", "url": "https://files.pythonhosted.org/packages/7f/61/0463c4a5f4351113809e52da0b74645419d5fc86a85ac0cf6109115b74fd/ensemble_pkg-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d94557c941e9c7bb67a0c52946db4d02", "sha256": "5e2a3de60b6f8cd8b454615c062a58658892d138cb432f73df610086b977a1b3" }, "downloads": -1, "filename": "ensemble-pkg-0.0.1.tar.gz", "has_sig": false, "md5_digest": "d94557c941e9c7bb67a0c52946db4d02", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4442, "upload_time": "2019-07-14T00:50:37", "url": "https://files.pythonhosted.org/packages/b8/e0/a8cf72e9f7172eee27ba7d2209623fb6a37b2253e6cfbeb946318759b755/ensemble-pkg-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "1c1a7fdb7f023c48579ec35dceffdfe5", "sha256": "403c0c1491d6da002a89fa414a91126e0f06b700554bccbb2aacc7d1b17e6f23" }, "downloads": -1, "filename": "ensemble_pkg-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1c1a7fdb7f023c48579ec35dceffdfe5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6281, "upload_time": "2019-07-14T01:08:31", "url": "https://files.pythonhosted.org/packages/71/d7/2697dd49dcb199270bd10e12c50dfa0973ab9b1f3b0c1b365eb3e540a076/ensemble_pkg-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7bb2aa31a01f96296c8d7d76149bca7f", "sha256": "662c2e5eb266903d57a92feb825c39a57bb5d3b1d852acf0e23618ae65a41663" }, "downloads": -1, "filename": "ensemble-pkg-0.0.2.tar.gz", "has_sig": false, "md5_digest": "7bb2aa31a01f96296c8d7d76149bca7f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4497, "upload_time": "2019-07-14T01:08:34", "url": "https://files.pythonhosted.org/packages/ad/70/733204fd4d433444e3d327636d9c21a791f7aae7e3751c3168301762e322/ensemble-pkg-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "d40f35b99314a18d515ff3187b8c3f50", "sha256": "3ab18ed96daa2d9aa78ca36f28d26263952a263251160b3f0ccdb2ab1df23bb1" }, "downloads": -1, "filename": "ensemble_pkg-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "d40f35b99314a18d515ff3187b8c3f50", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9659, "upload_time": "2019-08-04T07:19:59", "url": "https://files.pythonhosted.org/packages/1b/3a/9416e9f319919e859d4fc4df2766711b02e43167d501424d51efd1cbbb43/ensemble_pkg-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "dba70b5fd04e82308da0a6e78c65e382", "sha256": "5c8a23bb7ae279177d9d2fd32b0da1e90b61809cfb76e5f08bf844a637fda070" }, "downloads": -1, "filename": "ensemble-pkg-0.0.3.tar.gz", "has_sig": false, "md5_digest": "dba70b5fd04e82308da0a6e78c65e382", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8310, "upload_time": "2019-08-04T07:20:01", "url": "https://files.pythonhosted.org/packages/fc/bd/5bad01c28ac6676161af0b5fa8cc60d1deec72eefe8f44d013d1dbdffd71/ensemble-pkg-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d40f35b99314a18d515ff3187b8c3f50", "sha256": "3ab18ed96daa2d9aa78ca36f28d26263952a263251160b3f0ccdb2ab1df23bb1" }, "downloads": -1, "filename": "ensemble_pkg-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "d40f35b99314a18d515ff3187b8c3f50", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9659, "upload_time": "2019-08-04T07:19:59", "url": "https://files.pythonhosted.org/packages/1b/3a/9416e9f319919e859d4fc4df2766711b02e43167d501424d51efd1cbbb43/ensemble_pkg-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "dba70b5fd04e82308da0a6e78c65e382", "sha256": "5c8a23bb7ae279177d9d2fd32b0da1e90b61809cfb76e5f08bf844a637fda070" }, "downloads": -1, "filename": "ensemble-pkg-0.0.3.tar.gz", "has_sig": false, "md5_digest": "dba70b5fd04e82308da0a6e78c65e382", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8310, "upload_time": "2019-08-04T07:20:01", "url": "https://files.pythonhosted.org/packages/fc/bd/5bad01c28ac6676161af0b5fa8cc60d1deec72eefe8f44d013d1dbdffd71/ensemble-pkg-0.0.3.tar.gz" } ] }