{
    "info": {
        "author": "Maxim Podkolzine",
        "author_email": "maxim.podkolzine@gmail.com",
        "bugtrack_url": null,
        "classifiers": [],
        "description": "============================================\nHyper-parameters Tuning for Machine Learning\n============================================\n\n- `Overview <#overview>`__\n    - `About <#about>`__\n    - `Installation <#installation>`__\n    - `How to use <#how-to-use>`__\n- `Features <#features>`__\n    - `Straight-forward specification <#straight-forward-specification>`__\n    - `Exploration-exploitation trade-off <#exploration-exploitation-trade-off>`__\n    - `Learning Curve Estimation <#learning-curve-estimation>`__\n- `Bayesian Optimization <#bayesian-optimization>`__\n\n--------\nOverview\n--------\n\nAbout\n=====\n\n*Hyper-Engine* is a toolbox for `model selection and hyper-parameters tuning <https://en.wikipedia.org/wiki/Hyperparameter_optimization>`__.\nIt aims to provide most state-of-the-art techniques via intuitive API and with minimum dependencies.\n*Hyper-Engine* is **not a framework**, which means it doesn't enforce any structure or design to the main code,\nthus making integration local and non-intrusive.\n\nInstallation\n============\n\n.. code-block:: shell\n\n    pip install git+https://github.com/maxim5/hyper-engine.git@master \n\nDependencies:\n\n-  Six, NumPy, SciPy\n-  TensorFlow (optional)\n-  PyPlot (optional, only for development)\n\nCompatibility:\n\n.. image:: https://travis-ci.org/maxim5/hyper-engine.svg?branch=master\n    :target: https://travis-ci.org/maxim5/hyper-engine\n\n-  Python 2.7, 3.5, 3.6\n\nLicense:\n\n- `Apache 2.0 <LICENSE>`__\n\n*Hyper-Engine* is designed to be ML-platform agnostic, but currently provides only simple `TensorFlow <https://github.com/tensorflow/tensorflow>`__ binding.\n\nHow to use\n==========\n\nAdapting your code to *Hyper-Engine* usually boils down to migrating hard-coded hyper-parameters to a dictionary (or an object)\nand giving names to particular tensors.\n\n**Before:**\n\n.. 
code-block:: python\n\n    def my_model():\n      x = tf.placeholder(...)\n      y = tf.placeholder(...)\n      ...\n      optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)\n      ...\n\n**After:**\n\n.. code-block:: python\n\n    def my_model(params):\n      x = tf.placeholder(..., name='input')\n      y = tf.placeholder(..., name='label')\n      ...\n      optimizer = tf.train.GradientDescentOptimizer(learning_rate=params['learning_rate'])\n      ...\n\n    # The model can now be run with any set of hyper-parameters\n\n\nThe rest of the integration code is isolated and can be placed in the ``main`` script.\nSee the examples of hyper-parameter tuning in the `examples <hyperengine/examples>`__ package.\n\n--------\nFeatures\n--------\n\nStraight-forward specification\n==============================\n\nThe crucial part of hyper-parameter tuning is the definition of a *domain*\nover which the engine is going to optimize the model. Some variables are continuous (e.g., the learning rate),\nsome variables are integer values in a certain range (e.g., the number of hidden units), some variables are categorical\nand represent architecture knobs (e.g., the choice of non-linearity).\n\nYou can define all these variables and their ranges in a ``numpy``-like fashion:\n\n.. 
code-block:: python\n\n    hyper_params_spec = {\n      'optimizer': {\n        'learning_rate': 10**spec.uniform(-3, -1),          # makes the continuous range [0.001, 0.1]\n        'epsilon': 1e-8,                                    # constants work too\n      },\n      'conv': {\n        'filters': [[3, 3, spec.choice(range(32, 48))],     # an integer in [32, 48)\n                    [3, 3, spec.choice(range(64, 96))],     # an integer in [64, 96)\n                    [3, 3, spec.choice(range(128, 192))]],  # an integer in [128, 192)\n        'activation': spec.choice(['relu','prelu','elu']),  # a categorical range: 1 of 3 activations\n        'down_sample': {\n          'size': [2, 2],\n          'pooling': spec.choice(['max_pool', 'avg_pool'])  # a categorical range: 1 of 2 pooling methods\n        },\n        'residual': spec.random_bool(),                     # either True or False\n        'dropout': spec.uniform(0.75, 1.0),                 # a uniform continuous range\n      },\n    }\n\nNote that ``10**spec.uniform(-3, -1)`` is not the same *distribution* as ``spec.uniform(0.001, 0.1)``\n(though they both define the same *range* of values).\nIn the first case, the whole logarithmic spectrum ``(-3, -1)`` is equally probable, while in\nthe second case, values are spread evenly on the linear scale, so the lowest decade ``[0.001, 0.01]`` receives only about 9% of the samples (the mean is ``0.0505``).\nSpecifying the learning rate domain as ``spec.uniform(0.001, 0.1)`` will therefore likely skew the results\ntowards higher learning rates. 
This highlights the importance of random variable transformations and arithmetic operations.\n\nExploration-exploitation trade-off\n==================================\n\nMachine learning model selection is expensive.\nEach model evaluation requires full training from scratch and may take minutes to hours to days, \ndepending on the problem complexity and available computational resources.\n*Hyper-Engine* provides an algorithm to explore the space of parameters efficiently, focus on the most promising areas,\nand thus converge to the maximum as fast as possible.\n\n**Example 1**: the true function is 1-dimensional, ``f(x) = x * sin(x)`` (black curve) on the [-10, 10] interval.\nRed dots represent each trial, the red curve is the `Gaussian Process <https://en.wikipedia.org/wiki/Gaussian_process>`__ mean,\nand the blue curve is the mean plus or minus one standard deviation.\nThe optimizer randomly chose the negative mode as more promising.\n\n.. image:: /.images/figure_1.png\n    :width: 80%\n    :alt: 1D Bayesian Optimization\n    :align: center\n\n**Example 2**: the 2-dimensional function ``f(x, y) = (x + y) / ((x - 1) ** 2 - sin(y) + 2)`` (black surface) on the [0,9]x[0,9] square.\nRed dots represent each trial; the Gaussian Process mean and standard deviations are not shown for simplicity.\nNote that to achieve the maximum both variables must be picked accurately.\n\n.. image:: /.images/figure_2-1.png\n   :width: 100%\n   :alt: 2D Bayesian Optimization\n   :align: center\n\n.. image:: /.images/figure_2-2.png\n   :width: 100%\n   :alt: 2D Bayesian Optimization\n   :align: center\n\nThe code for these and other examples is `here <https://github.com/maxim5/hyper-engine/blob/master/hyperengine/tests/strategy_test.py>`__.\n\nLearning Curve Estimation\n=========================\n\n*Hyper-Engine* can monitor the model performance during training and stop early if the model is learning too slowly.\nThis is done via *learning curve prediction*. 
Note that this technique is compatible with Bayesian Optimization, since\nit estimates the model accuracy after full training - this value can be safely used to update Gaussian Process parameters.\n\nExample code:\n\n.. code-block:: python\n\n    curve_params = {\n      'burn_in': 30,                # burn-in period: 30 models \n      'min_input_size': 5,          # start predicting after 5 epochs\n      'value_limit': 0.80,          # stop if the estimate is less than 80% with high probability\n    }\n    curve_predictor = LinearCurvePredictor(**curve_params)\n\nCurrently there is only one implementation of the predictor, ``LinearCurvePredictor``, \nwhich is very efficient, but requires a relatively large burn-in period to predict model accuracy reliably.\n\nNote that learning curves can be reused between different models and work quite well for the burn-in,\nso it's recommended to serialize and load curve data via the ``io_save_dir`` and ``io_load_dir`` parameters.\n\nSee also the following paper:\n`Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks\nby Extrapolation of Learning Curves <http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf>`__\n\n---------------------\nBayesian Optimization\n---------------------\n\nImplements the following `methods <https://en.wikipedia.org/wiki/Bayesian_optimization>`__:\n\n-  Probability of improvement (See H. J. Kushner. A new method of locating the maximum of an arbitrary multipeak curve in the presence of noise. J. Basic Engineering, 86:97\u2013106, 1964.)\n-  Expected Improvement (See J. Mockus, V. Tiesis, and A. Zilinskas. Toward Global Optimization, volume 2, chapter The Application of Bayesian Methods for Seeking the Extremum, pages 117\u2013128. 
Elsevier, 1978)\n-  `Upper Confidence Bound <http://www.jmlr.org/papers/volume3/auer02a/auer02a.pdf>`__\n-  `Mixed / Portfolio strategy <http://mlg.eng.cam.ac.uk/hoffmanm/papers/hoffman:2011.pdf>`__\n-  Naive random search.\n\nThe PI method prefers exploitation to exploration, while UCB does the opposite. One of the best strategies we've seen is a mixed one:\nstart with a high probability of UCB and gradually decrease it, increasing the PI probability.\n\nThe default kernel function is the `RBF kernel <https://en.wikipedia.org/wiki/Radial_basis_function_kernel>`__, but the set of kernels is extensible.",
        "description_content_type": null,
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/maxim5/hyper-engine",
        "keywords": "machine learning,hyper-parameters,model selection,bayesian optimization",
        "license": "Apache 2.0",
        "maintainer": "",
        "maintainer_email": "",
        "name": "hyperengine",
        "package_url": "https://pypi.org/project/hyperengine/",
        "platform": "",
        "project_url": "https://pypi.org/project/hyperengine/",
        "project_urls": {
            "Homepage": "https://github.com/maxim5/hyper-engine"
        },
        "release_url": "https://pypi.org/project/hyperengine/0.1.1/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Python library for Bayesian hyper-parameters optimization",
        "version": "0.1.1"
    },
    "last_serial": 3569838,
    "releases": {
        "0.1.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "05c3de48ece3ffc00c21ac639bdbdfe9",
                    "sha256": "1230857839fcaa94f8781966e2f10b1eb549a30977c7483a2933f315570d141e"
                },
                "downloads": -1,
                "filename": "hyperengine-0.1.1.tar.gz",
                "has_sig": false,
                "md5_digest": "05c3de48ece3ffc00c21ac639bdbdfe9",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 31237,
                "upload_time": "2018-02-10T12:56:52",
                "url": "https://files.pythonhosted.org/packages/d7/de/cc05d99e18ddb74012bf5d5ec8f7932fd5d667a5373c576d26dfad6f598a/hyperengine-0.1.1.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "05c3de48ece3ffc00c21ac639bdbdfe9",
                "sha256": "1230857839fcaa94f8781966e2f10b1eb549a30977c7483a2933f315570d141e"
            },
            "downloads": -1,
            "filename": "hyperengine-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "05c3de48ece3ffc00c21ac639bdbdfe9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 31237,
            "upload_time": "2018-02-10T12:56:52",
            "url": "https://files.pythonhosted.org/packages/d7/de/cc05d99e18ddb74012bf5d5ec8f7932fd5d667a5373c576d26dfad6f598a/hyperengine-0.1.1.tar.gz"
        }
    ]
}