{ "info": { "author": "James Bergstra", "author_email": "anon@anon.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Operating System :: MacOS :: MacOS X", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python", "Topic :: Scientific/Engineering", "Topic :: Software Development" ], "description": "# hyperopt-sklearn\n\n[Hyperopt-sklearn](http://hyperopt.github.com/hyperopt-sklearn/) is\n[Hyperopt](http://hyperopt.github.com/hyperopt)-based model selection among machine learning algorithms in\n[scikit-learn](http://scikit-learn.org/).\n\nSee how to use hyperopt-sklearn through [examples](http://hyperopt.github.io/hyperopt-sklearn/#documentation)\nor older\n[notebooks](http://nbviewer.ipython.org/github/hyperopt/hyperopt-sklearn/tree/master/notebooks)\n\n\n## Installation\n\nInstallation from a git clone using pip is supported:\n\n git clone git@github.com:hyperopt/hyperopt-sklearn.git\n (cd hyperopt-sklearn && pip install -e .)\n\n## Usage\n\nIf you are familiar with sklearn, adding the hyperparameter search with hyperopt-sklearn is only a one line change from the standard pipeline.\n\n```\nfrom hpsklearn import HyperoptEstimator, svc\nfrom sklearn import svm\n\n# Load Data\n# ...\n\nif use_hpsklearn:\n estim = HyperoptEstimator(classifier=svc('mySVC'))\nelse:\n estim = svm.SVC()\n\nestim.fit(X_train, y_train)\n\nprint(estim.score(X_test, y_test))\n# <>\n```\n\nComplete example using the Iris dataset:\n\n```\nfrom hpsklearn import HyperoptEstimator, any_classifier\nfrom sklearn.datasets import load_iris\nfrom hyperopt import tpe\nimport numpy as np\n\n# Download the data and split into training and test sets\n\niris = load_iris()\n\nX = iris.data\ny = iris.target\n\ntest_size = int(0.2 * len(y))\nnp.random.seed(13)\nindices = np.random.permutation(len(X))\nX_train = X[ indices[:-test_size]]\ny_train = y[ indices[:-test_size]]\nX_test = X[ indices[-test_size:]]\ny_test = y[ indices[-test_size:]]\n\n# Instantiate a HyperoptEstimator with the search space and number of evaluations\n\nestim = HyperoptEstimator(classifier=any_classifier('my_clf'),\n preprocessing=any_preprocessing('my_pre'),\n algo=tpe.suggest,\n max_evals=100,\n trial_timeout=120)\n\n# Search the hyperparameter space based on the data\n\nestim.fit( X_train, y_train )\n\n# Show the results\n\nprint( estim.score( X_test, y_test ) )\n# 1.0\n\nprint( estim.best_model() )\n# {'learner': ExtraTreesClassifier(bootstrap=False, class_weight=None, criterion='gini',\n# max_depth=3, max_features='log2', max_leaf_nodes=None,\n# min_impurity_decrease=0.0, min_impurity_split=None,\n# min_samples_leaf=1, min_samples_split=2,\n# min_weight_fraction_leaf=0.0, n_estimators=13, n_jobs=1,\n# oob_score=False, random_state=1, verbose=False,\n# warm_start=False), 'preprocs': (), 'ex_preprocs': ()}\n```\n\nHere's an example using MNIST and being more specific on the classifier and preprocessing.\n\n```\nfrom hpsklearn import HyperoptEstimator, extra_trees\nfrom sklearn.datasets import fetch_mldata\nfrom hyperopt import tpe\nimport numpy as np\n\n# Download the data and split into training and test sets\n\ndigits = fetch_mldata('MNIST original')\n\nX = digits.data\ny = digits.target\n\ntest_size = int(0.2 * len(y))\nnp.random.seed(13)\nindices = np.random.permutation(len(X))\nX_train = X[ indices[:-test_size]]\ny_train = y[ indices[:-test_size]]\nX_test = X[ indices[-test_size:]]\ny_test = y[ indices[-test_size:]]\n\n# Instantiate a HyperoptEstimator with the search space and number of evaluations\n\nestim = HyperoptEstimator(classifier=extra_trees('my_clf'),\n preprocessing=[],\n algo=tpe.suggest,\n max_evals=10,\n trial_timeout=300)\n\n# Search the hyperparameter space based on the data\n\nestim.fit( X_train, y_train )\n\n# Show the results\n\nprint( estim.score( X_test, y_test ) )\n# 0.962785714286 \n\nprint( estim.best_model() )\n# {'learner': ExtraTreesClassifier(bootstrap=True, class_weight=None, criterion='entropy',\n# max_depth=None, max_features=0.959202875857,\n# max_leaf_nodes=None, min_impurity_decrease=0.0,\n# min_impurity_split=None, min_samples_leaf=1,\n# min_samples_split=2, min_weight_fraction_leaf=0.0,\n# n_estimators=20, n_jobs=1, oob_score=False, random_state=3,\n# verbose=False, warm_start=False), 'preprocs': (), 'ex_preprocs': ()}\n```\n\n## Available Components\n\nNot all of the classifiers/regressors/preprocessing from sklearn have been implemented yet.\nA list of those currently available is shown below.\nIf there is something you would like that is not on the list, feel free to make an issue or a pull request!\nThe source code for implementing these functions is found [here](https://github.com/hyperopt/hyperopt-sklearn/blob/master/hpsklearn/components.py)\n\n### Classifiers\n\n```\nsvc\nsvc_linear\nsvc_rbf\nsvc_poly\nsvc_sigmoid\nliblinear_svc\n\nknn\n\nada_boost\ngradient_boosting\n\nrandom_forest\nextra_trees\ndecision_tree\n\nsgd\n\nxgboost_classification\n\nmultinomial_nb\ngaussian_nb\n\npassive_aggressive\n\nlinear_discriminant_analysis\nquadratic_discriminant_analysis\n\nrbm\n\ncolkmeans\n\none_vs_rest\none_vs_one\noutput_code\n\n```\n\nFor a simple generic search space across many classifiers, use `any_classifier`. If your data is in a sparse matrix format, use `any_sparse_classifier`.\n\n### Regressors\n\n```\nsvr\nsvr_linear\nsvr_rbf\nsvr_poly\nsvr_sigmoid\n\nknn_regression\n\nada_boost_regression\ngradient_boosting_regression\n\nrandom_forest_regression\nextra_trees_regression\n\nsgd_regression\n\nxgboost_regression\n```\n\nFor a simple generic search space across many regressors, use `any_regressor`. If your data is in a sparse matrix format, use `any_sparse_regressor`.\n\n### Preprocessing\n\n```\npca\n\none_hot_encoder\n\nstandard_scaler\nmin_max_scaler\nnormalizer\n\nts_lagselector\n\ntfidf\n\n```\n\nFor a simple generic search space across many preprocessing algorithms, use `any_preprocessing`.\nIf you are working with raw text data, use `any_text_preprocessing`.\nCurrently only TFIDF is used for text, but more may be added in the future.\nNote that the `preprocessing` parameter in `HyperoptEstimator` is expecting a list, since various preprocessing steps can be chained together.\nThe generic search space functions `any_preprocessing` and `any_text_preprocessing` already return a list, but the others do not so they should be wrapped in a list.\nIf you do not want to do any preprocessing, pass in an empty list `[]`.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/hyperopt/hyperopt-sklearn/archive/0.1.0.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://hyperopt.github.com/hyperopt-sklearn/", "keywords": "hyperopt,hyperparameter,sklearn", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "hpsklearn", "package_url": "https://pypi.org/project/hpsklearn/", "platform": "Linux", "project_url": "https://pypi.org/project/hpsklearn/", "project_urls": { "Download": "https://github.com/hyperopt/hyperopt-sklearn/archive/0.1.0.tar.gz", "Homepage": "http://hyperopt.github.com/hyperopt-sklearn/" }, "release_url": "https://pypi.org/project/hpsklearn/0.1.0/", "requires_dist": null, "requires_python": "", "summary": "Hyperparameter Optimization for sklearn", "version": "0.1.0" }, "last_serial": 5294711, "releases": { "0.0.3": [ { "comment_text": "", "digests": { "md5": "137b66bc8706eaafcc995237a4a07668", "sha256": "e530ed3f985f4eb4fd84fe29a6aaf4e8af03b7ff0584a30f7438f252faeaac84" }, "downloads": -1, "filename": "hpsklearn-0.0.3.tar.gz", "has_sig": false, "md5_digest": "137b66bc8706eaafcc995237a4a07668", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23544, "upload_time": "2019-05-21T00:18:23", "url": "https://files.pythonhosted.org/packages/81/16/60e9027bd8ae766188b8a7aebe4dffb667432147b36c49792f2ff0715759/hpsklearn-0.0.3.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "839ae12cc13d7d49ca511622c56bcc1d", "sha256": "3a320062eb3296b35cccac87ad8b28b95dbc5f1fad22dc81e911f26d5b491cb7" }, "downloads": -1, "filename": "hpsklearn-0.1.0.tar.gz", "has_sig": false, "md5_digest": "839ae12cc13d7d49ca511622c56bcc1d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26470, "upload_time": "2017-12-03T04:09:17", "url": "https://files.pythonhosted.org/packages/ce/cb/61b99f73621e2692abd0e730f7888a9983d01f626868336fa1db1d57bc1e/hpsklearn-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "839ae12cc13d7d49ca511622c56bcc1d", "sha256": "3a320062eb3296b35cccac87ad8b28b95dbc5f1fad22dc81e911f26d5b491cb7" }, "downloads": -1, "filename": "hpsklearn-0.1.0.tar.gz", "has_sig": false, "md5_digest": "839ae12cc13d7d49ca511622c56bcc1d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26470, "upload_time": "2017-12-03T04:09:17", "url": "https://files.pythonhosted.org/packages/ce/cb/61b99f73621e2692abd0e730f7888a9983d01f626868336fa1db1d57bc1e/hpsklearn-0.1.0.tar.gz" } ] }