{ "info": { "author": "Andres Arrieche", "author_email": "andres.arrieche.7@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Programming Language :: Python" ], "description": "scikit-learn-helper\n============\n\nscikit-learn-helper is a light library with the purpose of providing utility functions that makes working with scikit-learn even easier, by letting us to focus on the solving the probling instead of writting boilerplate code\n\n### Installation\n\n#### Dependencies\nscikit-learn-helper requires:\n\n scikit-learn (>= 0.20.2) Of course :)\n\n#### User installation\n\n\n pip install scikit-learn-helper\n\n\n#### Source code\n\n\n https://github.com/aras7/scikit-learn-helper\n\n### Examples\n\n#### How to use it?\n\n```python\nfrom sklearn_helper.model.evaluator import Evaluator\nmodels = {\n \"DummyRegressor\": {\n \"model\": DummyRegressor()\n }\n}\nevaluator = Evaluator(models)\n\ndataset = datasets.load_boston()\nX, y = dataset.data, dataset.target\nmodel = evaluator.evaluate(X, y)\n```\n\n###### Output\n\n```\nModel: DummyRegressor | cleaner:DummyCleaner | mean_squared_error:111.1508 |Time:0.00 sec\n```\n\n#### How to use other another metric for evaluation?\n```\nfrom sklearn_helper.model.evaluator import Evaluator\nfrom sklearn.metrics import r2_score\nmodels = {\n \"DummyRegressor\": {\n \"model\": DummyRegressor()\n }\n}\nevaluator = Evaluator(models, main_metric=r2_score)\n\ndataset = datasets.load_boston()\nX, y = dataset.data, dataset.target\nmodel = evaluator.evaluate(X, y)\n```\n\n###### Output\n```\nModel: DummyRegressor | cleaner:DummyCleaner | r2_score:-0.6891 |Time:0.00 sec\n```\n\n#### How to compare 2+ models?\n\n```python\nfrom sklearn_helper.model.evaluator import Evaluator\nmodels = {\n \"DummyRegressor\": {\n \"model\": DummyRegressor()\n }, \n \"LinearRegression\": {\n \"model\": linear_model.LinearRegression()\n }\n}\nevaluator = Evaluator(models)\n\ndataset = datasets.load_boston()\nX, y = dataset.data, dataset.target\nmodel = evaluator.evaluate(X, y)\n\n```\n\n###### Output\n\n```\nModel: DummyRegressor | cleaner:DummyCleaner | mean_squared_error:111.1508 |Time:0.00 sec\nModel: LinearRegression | cleaner:DummyCleaner | mean_squared_error:169.0083 |Time:0.00 sec\n```\n\n#### How to maximize instead of minimize ?\n\n```python\nfrom sklearn.metrics import accuracy_score\nfrom sklearn_helper.model.evaluator import Evaluator\n\nmodels = {\n \"DummyClassifier\": {\n \"model\": DummyClassifier()\n },\n \"SVC\": {\n \"model\": svm.SVC()\n }\n}\nevaluator = Evaluator(models, main_metric=accuracy_score, maximize_metric=True)\n\ndigits = datasets.load_digits()\nX, y = digits.data, digits.target\nmodel = evaluator.evaluate(X, y)\n```\n\n###### Output\n\n```\nModel: SVC | cleaner:DummyCleaner | accuracy_score:0.4179 |Time:0.89 sec\nModel: DummyClassifier | cleaner:DummyCleaner | accuracy_score:0.0896 |Time:0.00 sec\n```\n\n#### How to compare and tune models?\n```python\nfrom sklearn.metrics import accuracy_score\nfrom sklearn_helper.model.evaluator import Evaluator\n\nmodels = {\n \"SVC\": {\n \"model\": svm.SVC()\n },\n \"Improved SVC\": {\n \"model\": svm.SVC(),\n \"params\": {\n \"gamma\": np.linspace(0, 0.1, num=10),\n \"C\": range(1, 10)\n }\n }\n}\nevaluator = Evaluator(models, main_metric=accuracy_score, maximize_metric=True)\n\ndigits = datasets.load_digits()\nX, y = digits.data, digits.target\nmodel = evaluator.evaluate(X, y)\nprint(model)\n```\n\n###### Output\n```\nModel: Improved SVC | cleaner:DummyCleaner | accuracy_score:0.6511 |Time:0.86 sec\nModel: SVC | cleaner:DummyCleaner | accuracy_score:0.4179 |Time:0.89 sec\nSVC(C=2, cache_size=200, class_weight=None, coef0=0.0,\n decision_function_shape='ovr', degree=3, gamma=0.011111111111111112,\n kernel='rbf', max_iter=-1, probability=False, random_state=None,\n shrinking=True, tol=0.001, verbose=False)\n```\n\n\n#### How to get multiple metric?\n```\nfrom sklearn.metrics import roc_auc_score, f1_score, accuracy_score\nfrom sklearn_helper.model.evaluator import Evaluator\nmodels = {\n \"DummyClassifier\": {\n \"model\": DummyClassifier()\n },\n \"SVC\": {\n \"model\": svm.SVC()\n }\n}\nevaluator = Evaluator(models, main_metric=roc_auc_score, \n additional_metrics=[f1_score, accuracy_score], \n maximize_metric=True)\ndigits = datasets.load_breast_cancer()\nX, y = digits.data, digits.target\nmodel = evaluator.evaluate(X, y)\n```\n\n###### Output\n```\nModel: SVC | cleaner:DummyCleaner | roc_auc_score:0.5000 |f1_score:0.7650 |accuracy_score:0.6277 |Time:0.11 sec\nModel: DummyClassifier | cleaner:DummyCleaner | roc_auc_score:0.4961 |f1_score:0.6063 |accuracy_score:0.5308 |Time:0.01 sec\n```\n\n#### Can I compare data engineering process? yes :)\n```\nfrom sklearn.metrics import accuracy_score\nfrom sklearn_helper.model.evaluator import Evaluator\nfrom sklearn_helper.data.DataCleaner import DataCleaner\nfrom sklearn_helper.data.DummyCleaner import DummyCleaner # No transformation is applied\n\nclass Thresholding(DataCleaner):\n THRESHOLD = 3\n def clean_training_data(self, x, y):\n return self.clean_x(x), y\n def clean_testing_data(self, x):\n _x = np.copy(x)\n _x[_x <= self.THRESHOLD] = 0\n _x[_x > self.THRESHOLD] = 1\n return _x\n\nmodels = {\n \"DummyClassifier\": {\n \"model\": DummyClassifier()\n },\n \"SVC\": {\n \"model\": svm.SVC(C=2, gamma=0.0111)\n }\n}\nevaluator = Evaluator(models, data_cleaners=[Thresholding(), DummyCleaner()], main_metric=accuracy_score)\n\ndigits = datasets.load_digits()\nX, y = digits.data, digits.target\nmodel = evaluator.evaluate(X, y)\n```\n\n###### Output\n```\nModel: DummyClassifier | cleaner:DummyCleaner | accuracy_score:0.0940 |Time:0.00 sec\nModel: DummyClassifier | cleaner:Thresholding | accuracy_score:0.1035 |Time:0.00 sec\nModel: SVC | cleaner:DummyCleaner | accuracy_score:0.6516 |Time:0.87 sec\nModel: SVC | cleaner:Thresholding | accuracy_score:0.8848 |Time:0.28 sec\n```\n\n#### For more examples refer to:\n\n https://github.com/aras7/scikit-learn-helper/tree/master/examples\n\n\n### ToDo:\n\n* Add unit tests\n* Add prediction time to printed results\n* Improve documentation\n * Splitter\n* Add functionality to test different dataset as an alternative to `from sklearn.model_selection import cross_val_score`. It might be useful when resampling inside DataCleaners\n* Add codecov.io suport\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "https://pypi.org/project/scikit-learn-helper/#files", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "Apache", "maintainer": "", "maintainer_email": "", "name": "scikit-learn-helper", "package_url": "https://pypi.org/project/scikit-learn-helper/", "platform": "", "project_url": "https://pypi.org/project/scikit-learn-helper/", "project_urls": { "Download": "https://pypi.org/project/scikit-learn-helper/#files" }, "release_url": "https://pypi.org/project/scikit-learn-helper/0.0.10/", "requires_dist": [ "scikit-learn" ], "requires_python": "", "summary": "Helper code to make easier working with sklearn. https://github.com/aras7/scikit-learn-helper", "version": "0.0.10" }, "last_serial": 4949314, "releases": { "0.0.0.1": [ { "comment_text": "", "digests": { "md5": "cb6b6e3d0c7e97b9475637c0fc52d97a", "sha256": "8bb824a902d4c63b9a0e498c365310103b6ea82d044faa031bae0500e53c4acb" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.0.1.tar.gz", "has_sig": false, "md5_digest": "cb6b6e3d0c7e97b9475637c0fc52d97a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 996, "upload_time": "2019-03-07T03:30:17", "url": "https://files.pythonhosted.org/packages/c1/b8/4ad2361100aa8fa2acb5dbd1fb51c81e54ecc2fa593048da212b95f9312e/scikit_learn_helper-0.0.0.1.tar.gz" } ], "0.0.1": [ { "comment_text": "", "digests": { "md5": "e013ebe11e2132fffdefade27fc615f1", "sha256": "8a2568e7eb6c6ca228c8b7a6df1b5f39be0eb17e24e41cfad516eda5effdf74e" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.1.tar.gz", "has_sig": false, "md5_digest": "e013ebe11e2132fffdefade27fc615f1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 987, "upload_time": "2019-03-07T03:03:53", "url": "https://files.pythonhosted.org/packages/7a/9f/448898d2b0f7fd57ef3164a2648feab6d638342ad9b20a343c227cc3f861/scikit_learn_helper-0.0.1.tar.gz" } ], "0.0.10": [ { "comment_text": "", "digests": { "md5": "258dc0017755009850f7d1decf60627f", "sha256": "2b89a60003782370c44da81a5b4fd1636eda8144bdbdeb98522b05785754b8db" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.10-py3-none-any.whl", "has_sig": false, "md5_digest": "258dc0017755009850f7d1decf60627f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 22805, "upload_time": "2019-03-17T02:07:24", "url": "https://files.pythonhosted.org/packages/6d/ac/b4da3f513f6a901b51f5d45a633e9814682f20d26abae5e3b327757ff5c5/scikit_learn_helper-0.0.10-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a3bab50ecd52c4bf4946312ce8ff24ed", "sha256": "9dd9bc261c32b697223d1db128cae555e13b00bbedd687d4b37e30f35bc86a77" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.10.tar.gz", "has_sig": false, "md5_digest": "a3bab50ecd52c4bf4946312ce8ff24ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9628, "upload_time": "2019-03-17T02:07:25", "url": "https://files.pythonhosted.org/packages/c6/5c/aed45386892dcd428198c686b1619d0bbb9b7f215ee579787dd70cc4dd04/scikit_learn_helper-0.0.10.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "28bbbd4de040a646541c6b2cd55492b7", "sha256": "7dec6e41dad8fd4ed26dde01b08b66c07ddea90daa12aab24cc4e9004df6da57" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.2.tar.gz", "has_sig": false, "md5_digest": "28bbbd4de040a646541c6b2cd55492b7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1014, "upload_time": "2019-03-07T03:38:23", "url": "https://files.pythonhosted.org/packages/7f/0a/2c343e674ccc79b0227f24a21834c44b8d3ebc96c2479ab589d6f5798db2/scikit_learn_helper-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "e76a1f2e2a404cac7ed7e66b7df7b3d0", "sha256": "abae7881bbc363e8956346c0d49c7c0ae8035dfcfa23d11ac174fa402190a5dc" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.3.tar.gz", "has_sig": false, "md5_digest": "e76a1f2e2a404cac7ed7e66b7df7b3d0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1008, "upload_time": "2019-03-07T03:45:04", "url": "https://files.pythonhosted.org/packages/c2/71/d352aa8e7cf386b5982a14c79342a9a7f43774f92afb37015b246cae9186/scikit_learn_helper-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "6c0e0b40645e3e6b92f2fc3a86ecae9a", "sha256": "5f176a603d0a77025300d0d88ec3bffcc60856dd081e8d7d0ff7c085e3edb3ce" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.4.tar.gz", "has_sig": false, "md5_digest": "6c0e0b40645e3e6b92f2fc3a86ecae9a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 973, "upload_time": "2019-03-07T03:48:00", "url": "https://files.pythonhosted.org/packages/5a/d3/8f4ce024dca9753cc2bf4c6ec0a98882216ae7b296f9374797e59967bacd/scikit_learn_helper-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "56429ed0a5af8aa25cd9e9fcf0bebd51", "sha256": "09d775e4804c75ba15571c68206e8648eb1afef98a9143450f1265d1d8315057" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.5.tar.gz", "has_sig": false, "md5_digest": "56429ed0a5af8aa25cd9e9fcf0bebd51", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 982, "upload_time": "2019-03-07T03:55:04", "url": "https://files.pythonhosted.org/packages/d1/55/c76d5a331677ba237be799082a5f9f1a3203b39aba15690e9a8fc5dfb7b3/scikit_learn_helper-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "46f2f8fe3c10f045ebf6e43bc0c99992", "sha256": "7648dd99088855cb64c1d4ceeaa91df1def6bdebcb455fed93332e05759877bd" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.6.tar.gz", "has_sig": false, "md5_digest": "46f2f8fe3c10f045ebf6e43bc0c99992", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 976, "upload_time": "2019-03-07T04:03:35", "url": "https://files.pythonhosted.org/packages/64/83/e2be3f40dec169a16a2708bbf7ead308674c28ae288dcd6a775d589e4aa9/scikit_learn_helper-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "307793c685471f4bbd7c212db22ad4c6", "sha256": "bceb8dcf4a1f5d177861cce4de0c07bd8c33be7177c6fef37be8428fc32ce8e9" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "307793c685471f4bbd7c212db22ad4c6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 1272, "upload_time": "2019-03-07T04:11:03", "url": "https://files.pythonhosted.org/packages/58/d9/6d0814806d70a42df9ffd4fe421d74f9877ce8323396748551f74bfdc917/scikit_learn_helper-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "30d010f64a4bdf7a03bf72b242c34c6a", "sha256": "ee6795a04495a2833a6209d014aba6fb16cafb0136b2ec5687273bae5dda4570" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.7.tar.gz", "has_sig": false, "md5_digest": "30d010f64a4bdf7a03bf72b242c34c6a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 971, "upload_time": "2019-03-07T04:11:05", "url": "https://files.pythonhosted.org/packages/b1/37/4a960c84f2a45972b5ec71b7e45fa28da3ba220f60f8b7d3b0417b0de0a0/scikit_learn_helper-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "99889095966a297f26c8de5132b433b2", "sha256": "a5c8685c77706d379eaa823d2a6dacac6e01fbb18399c6f8089046fa06b379f6" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.8-py3-none-any.whl", "has_sig": false, "md5_digest": "99889095966a297f26c8de5132b433b2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10889, "upload_time": "2019-03-07T04:15:17", "url": "https://files.pythonhosted.org/packages/a2/6b/c97900204d533d2346e8b299239d2f957af84f0d045eb87aa2d0de894ea0/scikit_learn_helper-0.0.8-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "58567752919c176ac4f48cecf7701343", "sha256": "12c946322f1ff01406cb618668d380715366bddec6398dd36876d9a442198bc4" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.8.tar.gz", "has_sig": false, "md5_digest": "58567752919c176ac4f48cecf7701343", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5392, "upload_time": "2019-03-07T04:15:19", "url": "https://files.pythonhosted.org/packages/45/62/b28001825f0c329d8b351fe577e268d27e068a9374aff2fcd5067f0c79ea/scikit_learn_helper-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "8c868ea7e70e663d90939fca7a67ae40", "sha256": "29ceb3aff64396aad9f74c848bde28d862af35fac0bb5de1c2d8bd791443571c" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.9-py3-none-any.whl", "has_sig": false, "md5_digest": "8c868ea7e70e663d90939fca7a67ae40", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13921, "upload_time": "2019-03-16T01:07:16", "url": "https://files.pythonhosted.org/packages/be/1c/838033396f03defbc0d36587636dff61d98bbc84089535b62abb54686a42/scikit_learn_helper-0.0.9-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5f472d82c79b46b757c3cfc4e23605af", "sha256": "b0ce426c07880709bd529505b2829322ba5f103fbc3d22ab9975a56b0588ddae" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.9.tar.gz", "has_sig": false, "md5_digest": "5f472d82c79b46b757c3cfc4e23605af", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6757, "upload_time": "2019-03-16T01:07:17", "url": "https://files.pythonhosted.org/packages/95/fd/901c2be05f0d293b3735e59c8f06dee4b20358991736d1a48106c189e075/scikit_learn_helper-0.0.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "258dc0017755009850f7d1decf60627f", "sha256": "2b89a60003782370c44da81a5b4fd1636eda8144bdbdeb98522b05785754b8db" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.10-py3-none-any.whl", "has_sig": false, "md5_digest": "258dc0017755009850f7d1decf60627f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 22805, "upload_time": "2019-03-17T02:07:24", "url": "https://files.pythonhosted.org/packages/6d/ac/b4da3f513f6a901b51f5d45a633e9814682f20d26abae5e3b327757ff5c5/scikit_learn_helper-0.0.10-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a3bab50ecd52c4bf4946312ce8ff24ed", "sha256": "9dd9bc261c32b697223d1db128cae555e13b00bbedd687d4b37e30f35bc86a77" }, "downloads": -1, "filename": "scikit_learn_helper-0.0.10.tar.gz", "has_sig": false, "md5_digest": "a3bab50ecd52c4bf4946312ce8ff24ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9628, "upload_time": "2019-03-17T02:07:25", "url": "https://files.pythonhosted.org/packages/c6/5c/aed45386892dcd428198c686b1619d0bbb9b7f215ee579787dd70cc4dd04/scikit_learn_helper-0.0.10.tar.gz" } ] }