{
    "info": {
        "author": "Simon Larsson",
        "author_email": "simonlarsson0@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "License :: OSI Approved :: MIT License",
            "Programming Language :: Python :: 3 :: Only",
            "Topic :: Scientific/Engineering"
        ],
        "description": "# extrakit-learn\n\n[![PyPI version](https://badge.fury.io/py/xklearn.svg)](https://pypi.python.org/pypi/xklearn/) \n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/simon-larsson/extrakit-learn/blob/master/LICENSE)\n\nMachine learnings components built to extend scikit-learn. All components use scikit's [object API](https://scikit-learn.org/stable/developers/contributing.html#apis-of-scikit-learn-objects) to work interchangably with scikit components. It is mostly a collection of tools that have been useful for [Kaggle](https://www.kaggle.com) competitions. extrakit-learn is in no way affiliated with scikit-learn in anyway, just inspired by it.\n\n## Installation\n\n    pip install xklearn\n\n## Components\n- [CategoryEncoder](https://github.com/simon-larsson/extrakit-learn#categoryencoder) - Like scikit's LabelEncoder but supports NaNs and unseen values.\n- [CountEncoder](https://github.com/simon-larsson/extrakit-learn#countencoder) - Categorical feature engineering on a column based on value counts.\n- [TargetEncoder](https://github.com/simon-larsson/extrakit-learn#targetencoder) - Categorical feature engineering on a column based on target means.\n- [MultiColumnEncoder](https://github.com/simon-larsson/extrakit-learn#multicolumnencoder) - Apply a column encoder to multiple columns.\n- [FoldEstimator](https://github.com/simon-larsson/extrakit-learn#foldestimator) - K-fold on scikit estimator wrapped into an estimator.\n- [FoldLightGBM](https://github.com/simon-larsson/extrakit-learn#foldlightgbm) - K-fold on LGBM wrapped into an estimator.\n- [FoldXGBoost](https://github.com/simon-larsson/extrakit-learn#foldxgboost) - K-fold on XGBoost wrapped into an estimator.\n- [StackClassifier](https://github.com/simon-larsson/extrakit-learn#stackclassifier) - Stack an ensemble of classifiers with a meta classifier.\n- [StackRegressor](https://github.com/simon-larsson/extrakit-learn#stackregressor) - Stack an ensemble of regressors with a meta regressor.\n- [compress_dataframe](https://github.com/simon-larsson/extrakit-learn#compress_dataframe) - Reduce memory of a Pandas dataframe.\n\n### Hierachy\n    xklearn\n    \u00e2\u201d\u201a\n    \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac preprocessing\n    \u00e2\u201d\u201a   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac CategoryEncoder\n    \u00e2\u201d\u201a   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac CountEncoder\n    \u00e2\u201d\u201a   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac TargetEncoder      \n    \u00e2\u201d\u201a   \u00e2\u201d\u201d\u00e2\u201d\u20ac\u00e2\u201d\u20ac MultiColumnEncoder\n    \u00e2\u201d\u201a\n    \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac models\n    \u00e2\u201d\u201a   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac FoldEstimator\n    \u00e2\u201d\u201a   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac FoldLightGBM\n    |   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac FoldXGBoost\n    |   \u00e2\u201d\u0153\u00e2\u201d\u20ac\u00e2\u201d\u20ac StackClassifier\n    |   \u00e2\u201d\u201d\u00e2\u201d\u20ac\u00e2\u201d\u20ac StackRegressor\n    |\n    \u00e2\u201d\u201d\u00e2\u201d\u20ac\u00e2\u201d\u20ac utils\n\n##### Example\n\n    from xklearn.models import FoldEstimator\n\n### CategoryEncoder\nWraps scikit's LabelEncoder, allowing missing and unseen values to be handled.\n\n#### Arguments\n`unseen` - Strategy for handling unseen values. See replacement strategies below for options.\n\n`missing` - Strategy for handling missing values. See replacement strategies below for options.\n\n##### Replacement strategies\n\n`'encode'` - Replace value with -1.\n\n`'nan'` - Replace value with np.nan.\n\n`'error'` - Raise ValueError.\n\n#### Example\n```python\nfrom xklearn.preprocessing import CategoryEncoder\n...\n\nce = CategoryEncoder(unseen='nan', missing='nan')\nX[:, 0] = ce.fit_transform(X[:, 0])\n```\n\n### CountEncoder\nReplaces categorical values with their respective value count during training. Classes with a count of one and previously unseen classes during prediction are encoded as either one or NaN.\n\n#### Arguments\n`unseen` - Strategy for handling unseen values. See replacement strategies below for options.\n\n`missing` - Strategy for handling missing values. See replacement strategies below for options.\n\n##### Replacement strategies\n\n`'one'` - Replace value with 1.\n\n`'nan'` - Replace value with np.nan.\n\n`'error'` - Raise ValueError.\n\n#### Example\n```python\nfrom xklearn.preprocessing import CountEncoder\n...\n\nce = CountEncoder(unseen='one')\nX[:, 0] = ce.fit_transform(X[:, 0])\n```\n\n### TargetEncoder\nPerforms target mean encoding of categorical features with optional smoothing.\n\n#### Arguments\n`smoothing` - Smoothing weight.\n\n`unseen` - Strategy for handling unseen values. See replacement strategies below for options.\n\n`missing` - Strategy for handling missing values. See replacement strategies below for options.\n\n##### Replacement strategies\n\n`'global'` - Replace value with global target mean.\n\n`'nan'` - Replace value with np.nan.\n\n`'error'` - Raise ValueError.\n\n#### Example\n\n```python\nfrom xklearn.preprocessing import TargetEncoder\n...\n\nte = TargetEncoder(smoothing=10)\nX[:, 0] = te.fit_transform(X[:, 0], y)\n```\n\n### MultiColumnEncoder\nApplies a column encoder over multiple columns.\n\n#### Arguments\n`enc` - Base encoder that will be applied to selected columns\n\n`columns` - Column selection, either bool-mask, indices or None (default=None).\n\n#### Example\n```python\nfrom xklearn.preprocessing import CountEncoder\nfrom xklearn.preprocessing import MultiColumnEncoder\n...\n\ncolumns = [1, 3, 4]\nenc = CountEncoder()\n\nmce = MultiColumnEncoder(enc, columns)\nX = mce.fit_transform(X)\n```\n\n### FoldEstimator\nK-fold wrapped into an estimator that performs cross validation over a selected folding method automatically when fit. Can optionally be used as a stacked ensemble of k estimators after fit.\n\n#### Arguments\n`est` - Base estimator.\n\n`fold` - Folding cross validation object, i.e KFold and StratifedKfold.\n\n`metric` - Evaluation metric.\n\n`refit_full` - Flag indicting post fit behaviour. True will do a full refit on the full data, False will make it a stacked ensemble trained on the different folds.\n\n`verbose` - Flag for printing fold scores during fit.\n\n#### Example\n```python\nfrom xklearn.models import FoldEstimator\n...\n\nbase = RandomForestRegressor(n_estimators=10)\nfold = KFold(n_splits=5)\n\nest = FoldEstimator(base, fold=fold, metric=mean_squared_error, verbose=1)\n\nest.fit(X_train, y_train)\nest.predict(X_test)\n```\nOutput:\n```\nFinished fold 1 with score: 200.8023\nFinished fold 2 with score: 261.2365\nFinished fold 3 with score: 169.2404\nFinished fold 4 with score: 186.7915\nFinished fold 5 with score: 205.0894\nFinished with a total score of: 204.6813\n```\n\n### FoldLightGBM\nK-fold wrapped into an estimator that performs cross validation on a LGBM over a selected folding method automatically when fit. Can optionally be used as a stacked ensemble of k estimators after fit.\n\n#### Arguments\n`lgbm` - Base estimator.\n\n`fold` - Folding cross validation object, i.e KFold and StratifedKfold.\n\n`metric` - Evaluation metric.\n\n`fit_params` - Dictionary of parameter that should be fed to the fit method.\n\n`refit_full` - Flag indicting post fit behaviour. True will do a full refit on the full data, False will make it a stacked ensemble trained on the different folds.\n\n`refit_params` - Dictionary of parameter that should be fed to the refit if refit_full=False.\n\n`verbose` - Flag for printing fold scores during fit.\n\n#### Example\n```python\nfrom xklearn.models import FoldLightGBM\n...\n\nbase = LGBMClassifier(n_estimators=1000)\nfold = KFold(n_splits=5)\nfit_params = {'eval_metric': 'auc',\n              'early_stopping_rounds': 50,\n              'verbose': 0}\n              \nfold_lgbm = FoldLightGBM(base, \n                         fold=fold, \n                         metric=roc_auc_score,\n                         fit_params=fit_params,\n                         verbose=1)\n               \nfold_lgbm.fit(X_train, y_train)\nfold_lgbm.predict(X_test)\n```\nOutput:\n```\nFinished fold 1 with score: 0.9114\nFinished fold 2 with score: 0.9265\nFinished fold 3 with score: 0.9419\nFinished fold 4 with score: 0.9189\nFinished fold 5 with score: 0.9152\nFinished with a total score of: 0.9225\n```\n\n### FoldXGBoost\nK-fold wrapped into an estimator that performs cross validation on a XGBoost over a selected folding method automatically when fit. Can optionally be used as a stacked ensemble of k estimators after fit.\n\n#### Arguments\n`xgb` - Base estimator.\n\n`fold` - Folding cross validation object, i.e KFold and StratifedKfold.\n\n`metric` - Evaluation metric.\n\n`fit_params` - Dictionary of parameter that should be fed to the fit method.\n\n`refit_full` - Flag indicting post fit behaviour. True will do a full refit on the full data, False will make it a stacked ensemble trained on the different folds.\n\n`refit_params` - Dictionary of parameter that should be fed to the refit if refit_full=False.\n\n`verbose` - Flag for printing fold scores during fit.\n\n#### Example\n```python\nfrom xklearn.models import FoldXGBoost\n...\n\nbase = XGBRegressor(objective=\"reg:linear\", random_state=42)\nfold = KFold(n_splits=5)\nfit_params = {'eval_metric': 'mse',\n              'early_stopping_rounds': 5,\n              'verbose': 0}\n              \nfold_xgb = FoldXGBoost(base, \n                       fold=fold, \n                       metric=mean_squared_error,\n                       fit_params=fit_params,\n                       verbose=1)\n               \nfold_xgb.fit(X_train, y_train)\nfold_xgb.predict(X_test)\n```\nOutput:\n```\nFinished fold 1 with score: 3212.8362\nFinished fold 2 with score: 2179.7843\nFinished fold 3 with score: 2707.8460\nFinished fold 4 with score: 2988.6643\nFinished fold 5 with score: 3281.4299\nFinished with a total score of: 3274.9001\n```\n\n### StackClassifier\nEnsemble classifier that stacks an ensemble of classifiers by using their outputs as input features.\n\n#### Arguments\n`clfs` - List of ensemble of classifiers.\n\n`meta_clf` - Meta classifier that stacks the predictions of the ensemble.\n\n`keep_features` - Flag to train the meta classifier on the original features too.\n\n`refit` - Flag to retrain the ensemble of classifiers during fit.\n\n#### Example\n```python\nfrom xklearn.models import StackClassifier\n...\n\nmeta_clf = RidgeClassifier()\nensemble = [RandomForestClassifier(), KNeighborsClassifier(), SVC()]\n\nstack_clf = StackClassifier(clfs=ensemble, meta_clf=meta_clf, refit=True)\n\nstack_clf.fit(X_train, y_train)\ny_ = stack_clf.predict(X_test)\n```\n\n### StackRegressor\nEnsemble regressor that stacks an ensemble of regressors by using their outputs as input features.\n\n#### Arguments\n`regs` - List of ensemble of regressors.\n\n`meta_reg` - Meta regressor that stacks the predictions of the ensemble.\n\n`drop_first` : Drop first class probability to avoid multi-collinearity.\n\n`keep_features` - Flag to train the meta regressor on the original features too.\n\n`refit` - Flag to retrain the ensemble of regressors during fit.\n\n#### Example\n```python\nfrom xklearn.models import StackRegressor\n...\n\nmeta_reg = RidgeRegressor()\nensemble = [RandomForestRegressor(), KNeighborsRegressor(), SVR()]\n\nstack_reg = StackRegressor(regs=ensemble, meta_reg=meta_reg, refit=True)\n\nstack_reg.fit(X_train, y_train)\ny_ = stack_reg.predict(X_test)\n```\n\n### compress_dataframe\nReduce memory usage of a Pandas dataframe by finding columns that use larger variable types than unnecessary.\n\n#### Arguments\n`df` - Dataframe for memory reduction.\n\n`verbose` - Flag for printing result of memory reduction.\n\n#### Example\n```python\nfrom xklearn.utils import compress_dataframe\n...\n\ntrain = compress_dataframe(train, verbose=1)\n```\nOutput:\n```\nDataframe memory decreased to 169.60 MB (64.6% reduction)\n```",
        "description_content_type": "text/markdown",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/simon-larsson/extrakit-learn",
        "keywords": "",
        "license": "MIT",
        "maintainer": "",
        "maintainer_email": "",
        "name": "xklearn",
        "package_url": "https://pypi.org/project/xklearn/",
        "platform": "",
        "project_url": "https://pypi.org/project/xklearn/",
        "project_urls": {
            "Homepage": "https://github.com/simon-larsson/extrakit-learn"
        },
        "release_url": "https://pypi.org/project/xklearn/0.0.7/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Handy machine learning tools in the spirit of scikit-learn.",
        "version": "0.0.7"
    },
    "last_serial": 5813768,
    "releases": {
        "0.0.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "25cb19fd9c8370765f8efd55985c5dc5",
                    "sha256": "34d6180bccbaf7785482f2b57e3c36e18c43136e158156167eeff17b1c8e7445"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.1-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "25cb19fd9c8370765f8efd55985c5dc5",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 14419,
                "upload_time": "2019-05-08T17:02:12",
                "url": "https://files.pythonhosted.org/packages/57/6d/e8200d38c36f44c9dd8379558fa11313498f6d11608348b8d6a854187d34/xklearn-0.0.1-py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "041576d5ada8e48902564e3d5b04f7f2",
                    "sha256": "ec22a936b22e8e0e99e603ae694b891af5c4e405fd800421adf1e7dd75c7ef5e"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.1.tar.gz",
                "has_sig": false,
                "md5_digest": "041576d5ada8e48902564e3d5b04f7f2",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 9924,
                "upload_time": "2019-05-08T17:02:14",
                "url": "https://files.pythonhosted.org/packages/17/26/d9b3f2ba2bd2fa25d8a3321ec0bbcb7c065cea6b847c37708876a40e68ba/xklearn-0.0.1.tar.gz"
            }
        ],
        "0.0.2": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "944f1deaa90c7f3238178e9fecf1d6c7",
                    "sha256": "93bae38941c8bee9b3dc37faf2690f8ce8c0b5da2c43dfdaaaac56eaa32e4e93"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.2.tar.gz",
                "has_sig": false,
                "md5_digest": "944f1deaa90c7f3238178e9fecf1d6c7",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 12466,
                "upload_time": "2019-05-13T13:37:12",
                "url": "https://files.pythonhosted.org/packages/e0/5e/72306118d3bb0e631b4c59a7b13fcc6edf5e6733059983184fd9754d07c5/xklearn-0.0.2.tar.gz"
            }
        ],
        "0.0.3": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "92bfd63abcf8643e11179a04a00cd946",
                    "sha256": "97c6090764dd540af096c148f6b445dc26e18d76c21648c2e17460a80bab10e7"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.3.tar.gz",
                "has_sig": false,
                "md5_digest": "92bfd63abcf8643e11179a04a00cd946",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 12455,
                "upload_time": "2019-05-13T13:42:22",
                "url": "https://files.pythonhosted.org/packages/87/40/b4053fb53d425ce02832a17672002394bfc93bfe57e2425c006bf06efefe/xklearn-0.0.3.tar.gz"
            }
        ],
        "0.0.4": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "457a61746fe097af3ec28acf9d958d59",
                    "sha256": "1f09d058036a2433610f33ffb7ef639612d3ca5801f86c9f9fb3dde28ec4d1e3"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.4.tar.gz",
                "has_sig": false,
                "md5_digest": "457a61746fe097af3ec28acf9d958d59",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 14131,
                "upload_time": "2019-05-14T19:14:47",
                "url": "https://files.pythonhosted.org/packages/48/62/22d26635216f7d8bc5914fb280c8bd064177c7f52c73aef2f28e0841752e/xklearn-0.0.4.tar.gz"
            }
        ],
        "0.0.5": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "46a522189b6fd5aa061800979198608a",
                    "sha256": "65f2e1de05db6675f3f9bd94268b1a7b9aeb0cec38e2900ff72315d59acaec2a"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.5.tar.gz",
                "has_sig": false,
                "md5_digest": "46a522189b6fd5aa061800979198608a",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 14302,
                "upload_time": "2019-06-19T12:57:29",
                "url": "https://files.pythonhosted.org/packages/74/40/129fd015b844321d4ee6948919c96099f0fb466e59c0305fa30d9bd3cb6a/xklearn-0.0.5.tar.gz"
            }
        ],
        "0.0.6": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "e8008face3e177e7a99d27a5bc455be9",
                    "sha256": "e29fd82200123c00738632cfb8e1cb562d1e592b63f021c77b03706c4e66d700"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.6.tar.gz",
                "has_sig": false,
                "md5_digest": "e8008face3e177e7a99d27a5bc455be9",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 15298,
                "upload_time": "2019-09-10T16:34:16",
                "url": "https://files.pythonhosted.org/packages/0f/8f/b6496ad157fd879f5b9d616b6c87d2d74b5f13da54d04c66a465b4eb9bbf/xklearn-0.0.6.tar.gz"
            }
        ],
        "0.0.7": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "cde33481a71c1cc95377800208e03725",
                    "sha256": "916e0a7618b8bdf8988de7df5edc5e93ef3ebab609d0f46c1082a60988a6befb"
                },
                "downloads": -1,
                "filename": "xklearn-0.0.7.tar.gz",
                "has_sig": false,
                "md5_digest": "cde33481a71c1cc95377800208e03725",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 14990,
                "upload_time": "2019-09-11T08:59:03",
                "url": "https://files.pythonhosted.org/packages/cd/d5/b4f65e1c390fe013266fb8132c22db607fe31ca35348886a9ccfc2f5e0cb/xklearn-0.0.7.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "cde33481a71c1cc95377800208e03725",
                "sha256": "916e0a7618b8bdf8988de7df5edc5e93ef3ebab609d0f46c1082a60988a6befb"
            },
            "downloads": -1,
            "filename": "xklearn-0.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "cde33481a71c1cc95377800208e03725",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 14990,
            "upload_time": "2019-09-11T08:59:03",
            "url": "https://files.pythonhosted.org/packages/cd/d5/b4f65e1c390fe013266fb8132c22db607fe31ca35348886a9ccfc2f5e0cb/xklearn-0.0.7.tar.gz"
        }
    ]
}