{ "info": { "author": "Ralph Haygood", "author_email": "ralph@ralphhaygood.org", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Topic :: Scientific/Engineering" ], "description": "sklearn-gbmi: scikit-learn gradient-boosting-model interactions\n===============================================================\n\nThis distribution provides a Python module for computing Friedman and Popescu's *H* statistics, in order to look for\ninteractions among variables in scikit-learn gradient-boosting models\n(http://scikit-learn.org/stable/modules/ensemble.html#gradient-tree-boosting).\n\nSee Jerome H. Friedman and Bogdan E. Popescu, 2008, \"Predictive learning via rule ensembles\", *Ann. Appl. Stat.*\n**2**:916-954, http://projecteuclid.org/download/pdfview_1/euclid.aoas/1223908046, s. 8.1.\n\n\nInstallation\n------------\n\n pip install sklearn-gbmi\n\n\nUsage\n-----\n\nGiven a scikit-learn gradient-boosting model `gbm` that has been fitted to a NumPy array or pandas data frame\n`array_or_frame` and a list of indices of columns of the array or columns of the data frame `indices_or_columns`, the\n*H* statistic of the variables represented by the elements of `array_or_frame` and specified by `indices_or_columns` can\nbe computed via\n\n from sklearn_gbmi import *\n\n h(gbm, array_or_frame, indices_or_columns)\n\nAlternatively, the two-variable *H* statistic of each pair of variables represented by the elements of `array_or_frame`\nand specified by `indices_or_columns` can be computed via\n\n from sklearn_gbmi import *\n\n h_all_pairs(gbm, array_or_frame, indices_or_columns)\n\n(Compared to iteratively calling `h`, calling `h_all_pairs` avoids redundant computations.)\n\n`indices_or_columns` is optional, with default value `'all'`. If it is `'all'`, then all columns of `array_or_frame` are\nused.\n\n`NaN` is returned if a computation is spoiled by weak main effects and rounding errors.\n\n*H* varies from 0 to 1. The larger *H*, the stronger the evidence for an interaction among the variables.\n\n\nExample\n-------\n\nSee the Jupyter notebook example.ipynb (https://github.com/ralphhaygood/sklearn-gbmi/blob/master/example.ipynb) for a\ncomplete example of how to use the module.\n\n\nNotes\n-----\n\n1. Per Friedman and Popescu, only variables with strong main effects should be examined for interactions. Strengths of\nmain effects are available as `gbm.feature_importances_` once `gbm` has been fitted.\n\n2. Per Friedman and Popescu, collinearity among variables can lead to interactions in `gbm` that are not present in the\ntarget function. To forestall such spurious interactions, check for strong correlations among variables before fitting\n`gbm`.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/ralphhaygood/sklearn-gbmi/tarball/1.0.0", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ralphhaygood/sklearn-gbmi", "keywords": "boosted,boosting,data science,Friedman,gradient boosted,gradient boosting,H statistic,interaction,machine learning,Popescu,scikit learn,sklearn", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "sklearn-gbmi", "package_url": "https://pypi.org/project/sklearn-gbmi/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/sklearn-gbmi/", "project_urls": { "Download": "https://github.com/ralphhaygood/sklearn-gbmi/tarball/1.0.0", "Homepage": "https://github.com/ralphhaygood/sklearn-gbmi" }, "release_url": "https://pypi.org/project/sklearn-gbmi/1.0.0/", "requires_dist": null, "requires_python": null, "summary": "Compute Friedman and Popescu's H statistics, in order to look for interactions among variables in scikit-learn gradient-boosting models.", "version": "1.0.0" }, "last_serial": 2574547, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "5aea27938b5841a60517381759c0befc", "sha256": "7b49aaf248eaef969f6c517c727ea4f8506bbd4b845fb920963a6535a5df0f53" }, "downloads": -1, "filename": "sklearn-gbmi-1.0.0.zip", "has_sig": false, "md5_digest": "5aea27938b5841a60517381759c0befc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11144, "upload_time": "2017-01-14T21:27:04", "url": "https://files.pythonhosted.org/packages/e2/41/c2793c52cd0046efcba1ab41453c64039d47090f2bb5eb75bd123e9ffb74/sklearn-gbmi-1.0.0.zip" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5aea27938b5841a60517381759c0befc", "sha256": "7b49aaf248eaef969f6c517c727ea4f8506bbd4b845fb920963a6535a5df0f53" }, "downloads": -1, "filename": "sklearn-gbmi-1.0.0.zip", "has_sig": false, "md5_digest": "5aea27938b5841a60517381759c0befc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11144, "upload_time": "2017-01-14T21:27:04", "url": "https://files.pythonhosted.org/packages/e2/41/c2793c52cd0046efcba1ab41453c64039d47090f2bb5eb75bd123e9ffb74/sklearn-gbmi-1.0.0.zip" } ] }