{
    "info": {
        "author": "Christopher Shymansky",
        "author_email": "CMShymansky@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 3 - Alpha",
            "License :: OSI Approved :: Apache Software License",
            "Topic :: Utilities"
        ],
        "description": "# What\nPyplearnr is a tool designed to perform model selection, hyperparameter tuning, and model validation via nested k-fold cross-validation in a reproducible way.\n\n# Why\nI found GridSearchCV to be lacking. I wanted a tool that used a similar procedure to perform simultaneous hyperparameter tuning AND model selection, with a clear input that summarizes exactly which scikit-learn pipeline steps and parameter combinations will be used, and with results that allow perfect reproducibility. So, I made my own.\n\n# How\n### Use\nSee the [demo](https://nbviewer.jupyter.org/github/JaggedParadigm/pyplearnr/blob/master/pyplearnr_demo.ipynb) for more detailed use of pyplearnr with actual data.\n\nHere are the basic steps:\n#### 1) Place the data into a non-null feature matrix and target vector\n#### 2) Initialize the nested k-fold cross-validation object\n```python\nimport pyplearnr as ppl\n\nkfcv = ppl.NestedKFoldCrossValidation(outer_loop_fold_count=5,\n                                      inner_loop_fold_count=5)\n```\n#### 3) Specify the combinatorial pipeline schematic detailing all possible model/parameter combinations\n\nEx: The schematic below combines optional scaling (none, min-max, or standard), a principal component analysis that directly uses scikit-learn's sklearn.decomposition.PCA transformer, selection of k of the transformed components (k between 1 and 30), and either a k-nearest neighbors classifier (k between 1 and 30) or a random forest classifier with a maximum depth between 2 and 5 (and a specified random state for reproducibility).\n\n```python\nimport sklearn.decomposition\nfrom sklearn.ensemble import RandomForestClassifier\n\n# Number of features (30 in this example); assumes X is a 2-D NumPy array or DataFrame\nfeature_count = X.shape[1]\n\npipeline_schematic = [\n    {'scaler': {\n            'none': {},\n            'min_max': {},\n            'standard': {}\n        }\n    },\n    {'transform': {\n            'pca': {\n                # 'sklo' supplies the scikit-learn class to use directly\n                'sklo': sklearn.decomposition.PCA,\n                'n_components': [feature_count]\n            }\n        }\n    },\n    {'feature_selection': {\n            'select_k_best': {\n                'k': range(1, feature_count+1)\n            }\n        }\n    },\n    {'estimator': {\n            'knn': {\n                'n_neighbors': range(1, 31)\n            },\n            'random_forest': {\n                'sklo': RandomForestClassifier,\n                'max_depth': range(2, 6),\n                'random_state': [57]\n            }\n        }\n    }\n]\n```\n\n#### 4) Run pyplearnr\n```python\n# Perform nested k-fold cross-validation\nkfcv.fit(X, y, pipeline_schematic=pipeline_schematic,\n         scoring_metric='auc', score_type='median')\n```\n### Methodology\nThe core model selection and validation method is nested k-fold cross-validation (stratified for classification). Inner-fold contests are used for model selection, and outer-folds are used to cross-validate the final winning model.\n\nHere's the basic algorithm used by pyplearnr:\n\n- 1) Pyplearnr shuffles and divides the data into k validation outer-folds.\n- 2) For each outer-fold:\n\t- a) The remaining folds are combined to form the corresponding training set\n\t- b) This training set is divided into k (or possibly a different number of) inner-test-folds\n\t- c) For each inner-test-fold:\n\t\t- i) The remaining inner-test-folds are combined and used to train all pipelines/models, which are scored on the corresponding inner-test-fold\n\t- d) The winning model/pipeline of this outer-fold's inner-fold contest is chosen as that with the best median score over all inner-test-folds\n\t\t- i) The user is alerted if there is a tie and is expected to decide the winning pipeline (usually the simplest, for better generalizability)\n- 3) The final winning model/pipeline is chosen as that with the most wins across the inner-fold contests of all outer-folds\n\t- a) Again, the user is expected to decide the winner if there is a tie\n- 4) This final winning model/pipeline is trained on all of the training data for each outer-fold and tested on the corresponding validation set, and summary statistics representing expected out-of-sample performance are presented to the user.\n\n### Installation\n#### Dependencies\n\npyplearnr requires:\n\n- Python (>= 2.7 or >= 3.3)\n- scikit-learn (>= 0.18.2)\n- numpy (>= 1.13.0)\n- scipy (>= 0.19.1)\n- pandas (>= 0.20.2)\n- matplotlib (>= 2.0.2)\n\nFor use in Jupyter notebooks and the conda installation, I recommend having nb_conda (>= 2.2.0).\n\n#### User installation\nInstall by using pip:\n\n```\npip install pyplearnr\n```\n\nFor conda, you can issue the same command within a conda environment, or you can include this in your environment.yml file:\n\n```\n- pip:\n    - pyplearnr\n```\n\nand then either generate a new environment from the terminal using:\n\n```\nconda env create\n```\n\nor update an existing one (environment_name) using:\n\n```\nconda env update -n=environment_name -f=./environment.yml\n```\n\nAnother option is to simply clone the repository, link to its location in your code, and import it.\n",
        "description_content_type": null,
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "http://packages.python.org/pyplearnr",
        "keywords": "scikit-learn pipeline k-fold cross-validation model selection",
        "license": "OSI Approved :: Apache Software License",
        "maintainer": "",
        "maintainer_email": "",
        "name": "pyplearnr",
        "package_url": "https://pypi.org/project/pyplearnr/",
        "platform": "",
        "project_url": "https://pypi.org/project/pyplearnr/",
        "project_urls": {
            "Homepage": "http://packages.python.org/pyplearnr"
        },
        "release_url": "https://pypi.org/project/pyplearnr/1.0.11.1/",
        "requires_dist": [
            "matplotlib",
            "numpy",
            "pandas",
            "sklearn"
        ],
        "requires_python": "",
        "summary": "Pyplearnr is a tool for building, validating (via nested k-fold cross-validation), and testing scikit-learn pipelines more easily and elegantly.",
        "version": "1.0.11.1"
    },
    "last_serial": 3073867,
    "releases": {
        "1.0.10": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "124107649a8703b0ae1ea5a9067fed8e",
                    "sha256": "e3bf711db1875e196b0164703257f84eb4032203a96dab25f4808b562ef48993"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.10-py2-none-any.whl",
                "has_sig": false,
                "md5_digest": "124107649a8703b0ae1ea5a9067fed8e",
                "packagetype": "bdist_wheel",
                "python_version": "py2",
                "requires_python": null,
                "size": 33157,
                "upload_time": "2017-07-01T07:27:07",
                "url": "https://files.pythonhosted.org/packages/0e/bb/3b08a00e10869a429cc0724acc0a68dac2353d0edff2b3671a8afbf3ccdc/pyplearnr-1.0.10-py2-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "8046ca3bd5bce437db4fd79d2e99f89b",
                    "sha256": "aa9c9746140f296e38b55040fc297d0d3efd6928de094ef3a820cf4986f6ef01"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.10.tar.gz",
                "has_sig": false,
                "md5_digest": "8046ca3bd5bce437db4fd79d2e99f89b",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 27663,
                "upload_time": "2017-07-01T07:36:49",
                "url": "https://files.pythonhosted.org/packages/a3/51/ca7fa7b0ed8b0f6d4e69106d22a2b5f1b0c02239f7b1236d4a55b3760230/pyplearnr-1.0.10.tar.gz"
            }
        ],
        "1.0.10.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "ea982d753375be98140b53f0b9697bf1",
                    "sha256": "0b7182751bd7a1e42bdbb16e47c67abe65e3c954d50cb5496ff5aa9859ec3054"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.10.1-py2-none-any.whl",
                "has_sig": false,
                "md5_digest": "ea982d753375be98140b53f0b9697bf1",
                "packagetype": "bdist_wheel",
                "python_version": "py2",
                "requires_python": null,
                "size": 33048,
                "upload_time": "2017-07-18T05:23:54",
                "url": "https://files.pythonhosted.org/packages/53/0e/5706bd19a33aa56c4c6fd891350ecbda615382bc410130fa9cb70c301068/pyplearnr-1.0.10.1-py2-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "cd7de9a1cf66174bb527669976e264e6",
                    "sha256": "711904f823f747bbcbb80950538708551ab7793388dd2ba25c368d1665f23260"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.10.1.tar.gz",
                "has_sig": false,
                "md5_digest": "cd7de9a1cf66174bb527669976e264e6",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 27523,
                "upload_time": "2017-07-18T05:23:58",
                "url": "https://files.pythonhosted.org/packages/40/b3/0e60412ed0d17dd2ec809d066653561c89e4fac0c5f7ed6add67548c1204/pyplearnr-1.0.10.1.tar.gz"
            }
        ],
        "1.0.11": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "f3043873bc82a55fffbef9603a93cfc6",
                    "sha256": "17a7794a4199a1cb83f5b636e0d4626e04f9d7ef7705e13242059e5be7bf9fe6"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.11-py2-none-any.whl",
                "has_sig": false,
                "md5_digest": "f3043873bc82a55fffbef9603a93cfc6",
                "packagetype": "bdist_wheel",
                "python_version": "py2",
                "requires_python": null,
                "size": 33022,
                "upload_time": "2017-07-18T05:23:56",
                "url": "https://files.pythonhosted.org/packages/c3/98/4fbf58a8c57dd149187b0c52f14103f9b7cc6e687ec0568eaaad617a95fb/pyplearnr-1.0.11-py2-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "12b482c426da1972b5c9f925c956b16b",
                    "sha256": "b3c75809542206e4bda4f04b5d31a7ed13d996202879d0f3565f2822393cbc60"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.11.tar.gz",
                "has_sig": false,
                "md5_digest": "12b482c426da1972b5c9f925c956b16b",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 27528,
                "upload_time": "2017-07-18T05:23:59",
                "url": "https://files.pythonhosted.org/packages/1c/12/a2d8a44249d8c72f404b1b8308ddbd8347fed2f621690dc3046cc60c5070/pyplearnr-1.0.11.tar.gz"
            }
        ],
        "1.0.11.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "6411fe90c591cbf3b2d67b4e082b64a8",
                    "sha256": "b9e4ff9e79c1cea7c0ffe35e1964e4d233ffaafe0c5cb6cdcca0ee568b64cfc6"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.11.1-py2-none-any.whl",
                "has_sig": false,
                "md5_digest": "6411fe90c591cbf3b2d67b4e082b64a8",
                "packagetype": "bdist_wheel",
                "python_version": "py2",
                "requires_python": null,
                "size": 34265,
                "upload_time": "2017-08-04T22:52:09",
                "url": "https://files.pythonhosted.org/packages/26/10/edc08c7939b9d7f9db09d4a1c39e017d8615bbd5add20c9ec9da0a498d35/pyplearnr-1.0.11.1-py2-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "7d4ac3e16caefddcc655f27f5e56db19",
                    "sha256": "92594cc4d70f314eb6d5c5889a321acfca803a220fdbbb7269d1be965e137ab7"
                },
                "downloads": -1,
                "filename": "pyplearnr-1.0.11.1.tar.gz",
                "has_sig": false,
                "md5_digest": "7d4ac3e16caefddcc655f27f5e56db19",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 28568,
                "upload_time": "2017-08-04T22:52:11",
                "url": "https://files.pythonhosted.org/packages/12/76/c73a277b8271545d884bfe0b3481d19cbc5c1a118430858b5027194204a8/pyplearnr-1.0.11.1.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "6411fe90c591cbf3b2d67b4e082b64a8",
                "sha256": "b9e4ff9e79c1cea7c0ffe35e1964e4d233ffaafe0c5cb6cdcca0ee568b64cfc6"
            },
            "downloads": -1,
            "filename": "pyplearnr-1.0.11.1-py2-none-any.whl",
            "has_sig": false,
            "md5_digest": "6411fe90c591cbf3b2d67b4e082b64a8",
            "packagetype": "bdist_wheel",
            "python_version": "py2",
            "requires_python": null,
            "size": 34265,
            "upload_time": "2017-08-04T22:52:09",
            "url": "https://files.pythonhosted.org/packages/26/10/edc08c7939b9d7f9db09d4a1c39e017d8615bbd5add20c9ec9da0a498d35/pyplearnr-1.0.11.1-py2-none-any.whl"
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "7d4ac3e16caefddcc655f27f5e56db19",
                "sha256": "92594cc4d70f314eb6d5c5889a321acfca803a220fdbbb7269d1be965e137ab7"
            },
            "downloads": -1,
            "filename": "pyplearnr-1.0.11.1.tar.gz",
            "has_sig": false,
            "md5_digest": "7d4ac3e16caefddcc655f27f5e56db19",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 28568,
            "upload_time": "2017-08-04T22:52:11",
            "url": "https://files.pythonhosted.org/packages/12/76/c73a277b8271545d884bfe0b3481d19cbc5c1a118430858b5027194204a8/pyplearnr-1.0.11.1.tar.gz"
        }
    ]
}