{ "info": { "author": "Chapman Siu", "author_email": "chpmn.siu@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# TreeGrad\n\n`TreeGrad` implements a naive approach to converting a gradient boosted tree model into an online trainable model. It does this by creating differentiable tree models that can be trained with automatic differentiation frameworks. `TreeGrad` is in essence an implementation of Kontschieder, Peter, et al. \"Deep neural decision forests.\" with extensions.\n\nTo install:\n\n```\npython setup.py install\n```\n\nTo do: put this on `pypi`\n\nPlease cite:\n\n```\n@article{siu2019treegrad,\n title={TreeGrad: Transferring Tree Ensembles to Neural Networks},\n author={Siu, Chapman},\n journal={arXiv preprint arXiv:submit/2665625},\n year={2019}\n}\n```\n\n\n# Usage\n\n```py\n# Example data (assumption): any labelled classification dataset can stand in for X, y\nfrom sklearn.datasets import load_breast_cancer\nimport treegrad as tgd\n\nX, y = load_breast_cancer(return_X_y=True)\n\n# Fit the boosted tree ensemble, then keep training it online with partial_fit\nmod = tgd.TGDClassifier(num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=100, autograd_config={'refit_splits':False})\nmod.fit(X, y)\nmod.partial_fit(X, y)\n```\n\n# Requirements\n\nThe requirements for this package are:\n\n* lightgbm\n* scikit-learn\n* autograd\n\nFuture plans:\n\n* Add an implementation of neural architecture search for decision boundary splits (requires a bit of clean-up, TBA)\n  * This can be implemented quite easily using objects residing in `tree_utils.py`; the challenge is getting it to work in a sane manner with the `scikit-learn` interface.\n* GPU-enabled automatic differentiation framework\n* Support additional XGBoost/LightGBM features such as monotone constraints\n* Support `RegressorMixin`\n\n# Results\n\nWhen decision splits are reset and subsequently re-learned, TreeGrad can be competitive in performance with popular implementations (albeit an order of magnitude slower). 
The table below shows test-set accuracy on UCI benchmark datasets for boosted ensemble models (100 trees).\n\n\n| Dataset | TreeGrad | LightGBM | Scikit-Learn (Gradient Boosting Classifier) |\n| ---------| --------- | --------- | ------------------------------------------- |\n| adult | 0.860 | 0.873 | **0.874** |\n| covtype | 0.832 | **0.835** | 0.826 |\n| dna | **0.950** | 0.949 | 0.946 |\n| glass | 0.766 | **0.813** | 0.719 |\n| madelon | **0.882** | 0.881 | 0.866 |\n| soybean | **0.936** | **0.936** | 0.917 |\n| yeast | **0.591** | 0.573 | 0.542 |\n\n\n# Implementation\n\nTo understand the implementation of `TreeGrad`, we interpret a decision tree as a three-layer neural network, where the layers are as follows:\n\n1. Node layer, which determines the decision boundaries\n2. Routing layer, which determines which nodes are used to route to the final leaf nodes\n3. Leaf layer, which determines the final predictions\n\nIn the node layer, the decision boundaries can be interpreted as _axis-parallel_ decision boundaries from a typical linear classifier, i.e. a fully connected dense layer.\n\nThe routing layer uses a binary routing matrix, to which a global product routing is applied.\n\nThe leaf layer is a typical fully connected dense layer.\n\nThis approach is the same as the one taken by Kontschieder, Peter, et al. 
\"Deep neural decision forests.\"", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/chappers/TreeGrad", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "treegrad", "package_url": "https://pypi.org/project/treegrad/", "platform": "", "project_url": "https://pypi.org/project/treegrad/", "project_urls": { "Homepage": "http://github.com/chappers/TreeGrad" }, "release_url": "https://pypi.org/project/treegrad/1.0.0/", "requires_dist": null, "requires_python": ">=3.5", "summary": "transfer parameters from lightgbm to differentiable decision trees!", "version": "1.0.0" }, "last_serial": 5185827, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "5709fd528f6af855f81a8562f3a287a8", "sha256": "0ec91349f6c411e8516e27775f02929d0b15feee63184d10e9944a968d33ef21" }, "downloads": -1, "filename": "treegrad-1.0.0.tar.gz", "has_sig": false, "md5_digest": "5709fd528f6af855f81a8562f3a287a8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 12301, "upload_time": "2019-04-25T02:57:21", "url": "https://files.pythonhosted.org/packages/18/aa/bbca431e01080457213a2d8960cc781b4fa7b91fb8aa9b4a5c36bcb072a0/treegrad-1.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5709fd528f6af855f81a8562f3a287a8", "sha256": "0ec91349f6c411e8516e27775f02929d0b15feee63184d10e9944a968d33ef21" }, "downloads": -1, "filename": "treegrad-1.0.0.tar.gz", "has_sig": false, "md5_digest": "5709fd528f6af855f81a8562f3a287a8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 12301, "upload_time": "2019-04-25T02:57:21", "url": "https://files.pythonhosted.org/packages/18/aa/bbca431e01080457213a2d8960cc781b4fa7b91fb8aa9b4a5c36bcb072a0/treegrad-1.0.0.tar.gz" } ] }