{ "info": { "author": "Felix Last", "author_email": "mail@felixlast.de", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Information Analysis" ], "description": "Oversampling for Imbalanced Learning based on K-Means and SMOTE\n---------------------------------------------------------------\n\n|PyPI version| |Build Status| |Docs Status| |codecov|\n\nK-Means SMOTE is an oversampling method for class-imbalanced data. It\naids classification by generating minority class samples in safe and\ncrucial areas of the input space. The method avoids the generation of\nnoise and effectively overcomes imbalances between and within classes.\n\nThis project is a python implementation of k-means SMOTE. It is\ncompatible with the scikit-learn-contrib project\n`imbalanced-learn `__.\n\nInstallation\n------------\n\nDependencies\n~~~~~~~~~~~~\n\nThe implementation is tested under python 3.6 and works with the latest\nrelease of the imbalanced-learn framework:\n\n- imbalanced-learn (>=0.4.0, <0.5)\n- numpy (numpy>=1.13, <1.16)\n- scikit-learn (>=0.19.0, <0.21)\n\n.. _installation-1:\n\nInstallation\n~~~~~~~~~~~~\n\nPypi\n^^^^\n\n.. code:: sh\n\n pip install kmeans-smote\n\nFrom Source\n^^^^^^^^^^^\n\nClone this repository and run the setup.py file. Use the following\ncommands to get a copy from GitHub and install all dependencies:\n\n.. code:: sh\n\n git clone https://github.com/felix-last/kmeans_smote.git\n cd kmeans-smote\n pip install .\n\nDocumentation\n-------------\n\nFind the API documentation at https://kmeans_smote.readthedocs.io. As\nthis project follows the imbalanced-learn API, the `imbalanced-learn\ndocumentation `__\nmight also prove helpful.\n\nExample Usage\n~~~~~~~~~~~~~\n\n.. code:: python\n\n import numpy as np\n from imblearn.datasets import fetch_datasets\n from kmeans_smote import KMeansSMOTE\n\n datasets = fetch_datasets(filter_data=['oil'])\n X, y = datasets['oil']['data'], datasets['oil']['target']\n\n [print('Class {} has {} instances'.format(label, count))\n for label, count in zip(*np.unique(y, return_counts=True))]\n\n kmeans_smote = KMeansSMOTE(\n kmeans_args={\n 'n_clusters': 100\n },\n smote_args={\n 'k_neighbors': 10\n }\n )\n X_resampled, y_resampled = kmeans_smote.fit_sample(X, y)\n\n [print('Class {} has {} instances after oversampling'.format(label, count))\n for label, count in zip(*np.unique(y_resampled, return_counts=True))]\n\nExpected Output:\n\n::\n\n Class -1 has 896 instances\n Class 1 has 41 instances\n Class -1 has 896 instances after oversampling\n Class 1 has 896 instances after oversampling\n\nTake a look at `imbalanced-learn\npipelines `__\nfor efficient usage with cross-validation.\n\nAbout\n-----\n\nK-means SMOTE works in three steps:\n\n1. Cluster the entire input space using k-means [1].\n2. Distribute the number of samples to generate across clusters:\n\n 1. Filter out clusters which have a high number of majority class\n samples.\n 2. Assign more synthetic samples to clusters where minority class\n samples are sparsely distributed.\n\n3. Oversample each filtered cluster using SMOTE [2].\n\nContributing\n~~~~~~~~~~~~\n\nPlease feel free to submit an issue if things work differently than\nexpected. Pull requests are also welcome - just make sure that tests are\ngreen by running ``pytest`` before submitting.\n\nCitation\n~~~~~~~~\n\nIf you use k-means SMOTE in a scientific publication, we would\nappreciate citations to the following\n`paper `__:\n\n::\n\n @article{kmeans_smote,\n title = {Oversampling for Imbalanced Learning Based on K-Means and SMOTE},\n author = {Last, Felix and Douzas, Georgios and Bacao, Fernando},\n year = {2017},\n archivePrefix = \"arXiv\",\n eprint = \"1711.00837\",\n primaryClass = \"cs.LG\"\n }\n\nReferences\n~~~~~~~~~~\n\n[1] MacQueen, J. \u201cSome Methods for Classification and Analysis of\nMultivariate Observations.\u201d Proceedings of the Fifth Berkeley Symposium\non Mathematical Statistics and Probability, 1967, p.\u00a0281-297.\n\n[2] Chawla, Nitesh V., et al. \u201cSMOTE: Synthetic Minority over-Sampling\nTechnique.\u201d Journal of Artificial Intelligence Research, vol.\u00a016,\nJan.\u00a02002, p.\u00a0321357, doi:10.1613/jair.953.\n\n.. |PyPI version| image:: https://badge.fury.io/py/kmeans-smote.svg\n :target: https://badge.fury.io/py/kmeans-smote\n.. |Build Status| image:: https://travis-ci.org/felix-last/kmeans_smote.svg?branch=master\n :target: https://travis-ci.org/felix-last/kmeans_smote\n.. |Docs Status| image:: https://readthedocs.org/projects/kmeans-smote/badge/?version=latest\n :target: http://kmeans-smote.readthedocs.io/en/latest/?badge=latest\n.. |codecov| image:: https://codecov.io/gh/felix-last/kmeans_smote/branch/master/graph/badge.svg\n :target: https://codecov.io/gh/felix-last/kmeans_smote", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/felix-last/kmeans_smote", "keywords": "Class-imbalanced Learning,Oversampling,Classification,Clustering,Supervised Learning", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "kmeans-smote", "package_url": "https://pypi.org/project/kmeans-smote/", "platform": "", "project_url": "https://pypi.org/project/kmeans-smote/", "project_urls": { "Homepage": "https://github.com/felix-last/kmeans_smote" }, "release_url": "https://pypi.org/project/kmeans-smote/0.1.2/", "requires_dist": null, "requires_python": "", "summary": "Oversampling for imbalanced learning based on k-means and SMOTE", "version": "0.1.2" }, "last_serial": 5007951, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c736eefb034c20c968620dff3f17dc89", "sha256": "ff2582fd3e0d66b43e639ea0c20a76a7b2ac21789004ae46f12f4399870a6362" }, "downloads": -1, "filename": "kmeans_smote-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "c736eefb034c20c968620dff3f17dc89", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10580, "upload_time": "2017-10-29T22:13:27", "url": "https://files.pythonhosted.org/packages/d4/bd/adb88a6296f336937f2ddd086c299cb58e2c91291d47569d5f451bbed2ad/kmeans_smote-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e26774664797e2d453c37b34264d9290", "sha256": "795d00c84bdf56948446cdc1a8eac96f145ff9f10e052b780d08f10fbd865ef1" }, "downloads": -1, "filename": "kmeans_smote-0.0.1.tar.gz", "has_sig": false, "md5_digest": "e26774664797e2d453c37b34264d9290", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8671, "upload_time": "2017-10-29T18:42:52", "url": "https://files.pythonhosted.org/packages/ea/ef/c670c54a9322bde2eb28a60940f45a11ec9192a761ebd147e36f2646e70c/kmeans_smote-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "ec2903f1f970a0061aa70dd06f9f1026", "sha256": "5794d57395bb45777fb368c1330a388d557522d46f998417a082764c3526d075" }, "downloads": -1, "filename": "kmeans_smote-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "ec2903f1f970a0061aa70dd06f9f1026", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11133, "upload_time": "2017-11-25T21:15:21", "url": "https://files.pythonhosted.org/packages/dc/05/2c129939155422c050c51e0e6ba3aacc7cbc76309cdc682f18a4dc300c11/kmeans_smote-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1c055ffe25f737cabf219430f3dded30", "sha256": "5ec476ce20e274d815b1233ba99adf6c44dcac2b821610b8bff5fdd603c856b6" }, "downloads": -1, "filename": "kmeans_smote-0.0.2.tar.gz", "has_sig": false, "md5_digest": "1c055ffe25f737cabf219430f3dded30", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10439, "upload_time": "2017-11-25T21:15:36", "url": "https://files.pythonhosted.org/packages/64/38/7c8f7b28a1c6c8c8e7474e051624776161c8275aa765adb9080cf30c57d4/kmeans_smote-0.0.2.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "719d24114fb94abff72afd8036241f31", "sha256": "59a6405c756fb72f34395364d76b71232597049fcebfcc8453b844c7caf1f116" }, "downloads": -1, "filename": "kmeans_smote-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "719d24114fb94abff72afd8036241f31", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11094, "upload_time": "2018-02-06T17:07:13", "url": "https://files.pythonhosted.org/packages/44/3a/98ae08909d79f997613a9159923bf3a08477fb6b7d03b7476630aa5612c5/kmeans_smote-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "abf1d80c311b1dd67dd644d6ca3b9687", "sha256": "119bfc17934dff4d6a622a78e4e6f2f5b700fce4336e33e207f3eebdc4483f1c" }, "downloads": -1, "filename": "kmeans_smote-0.1.0.tar.gz", "has_sig": false, "md5_digest": "abf1d80c311b1dd67dd644d6ca3b9687", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11218, "upload_time": "2018-02-06T17:06:47", "url": "https://files.pythonhosted.org/packages/4c/8a/e06e211409b232720abc601cf2e6b122de463235e845547910000b433857/kmeans_smote-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "3c2f49820618a99d8abc3c989fcb5c63", "sha256": "42f5afacbe75c45a9b1795c6ae1960e7658e8258d0d6cd7a5f98564290aebcbb" }, "downloads": -1, "filename": "kmeans_smote-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "3c2f49820618a99d8abc3c989fcb5c63", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8905, "upload_time": "2019-03-30T19:42:41", "url": "https://files.pythonhosted.org/packages/db/61/7743e3f926bc6398c4cd76d2185b399567c6283619fea52f1e0fcb60ec41/kmeans_smote-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7e27f8496ee687215dc74cdfa885c308", "sha256": "ef0d0f8d7a559ccd7727935732a49ab9a17b5d97be762f145fc0c81f84101f19" }, "downloads": -1, "filename": "kmeans_smote-0.1.1.tar.gz", "has_sig": false, "md5_digest": "7e27f8496ee687215dc74cdfa885c308", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10549, "upload_time": "2019-03-30T19:42:55", "url": "https://files.pythonhosted.org/packages/d6/99/45dd088f6152d3cfd2098bbbcf78d62be1b867c2b68ec933e411b27169ec/kmeans_smote-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "25224e507e484a3e6432313b194dec14", "sha256": "23f18c280082c92831f682494e3e052c7ea78ddf07231727334ad2bc1e871582" }, "downloads": -1, "filename": "kmeans_smote-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "25224e507e484a3e6432313b194dec14", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8915, "upload_time": "2019-03-30T21:12:55", "url": "https://files.pythonhosted.org/packages/77/37/decf75573ea7449ae7d5abf015306fe728fd71af6e6eeecd831e1e5c0150/kmeans_smote-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3c3359c30114caffa2fbafa7071c3522", "sha256": "2abfec42111a38ede1d3adeba0f50efdfce10b0e0bdd9dd9fff2afdd67961684" }, "downloads": -1, "filename": "kmeans_smote-0.1.2.tar.gz", "has_sig": false, "md5_digest": "3c3359c30114caffa2fbafa7071c3522", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10643, "upload_time": "2019-03-30T21:12:41", "url": "https://files.pythonhosted.org/packages/2f/69/081a9a7d25d0555cfecdcf11523c9b3e4225ac6021fd1274675fa893a3c8/kmeans_smote-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "25224e507e484a3e6432313b194dec14", "sha256": "23f18c280082c92831f682494e3e052c7ea78ddf07231727334ad2bc1e871582" }, "downloads": -1, "filename": "kmeans_smote-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "25224e507e484a3e6432313b194dec14", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8915, "upload_time": "2019-03-30T21:12:55", "url": "https://files.pythonhosted.org/packages/77/37/decf75573ea7449ae7d5abf015306fe728fd71af6e6eeecd831e1e5c0150/kmeans_smote-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3c3359c30114caffa2fbafa7071c3522", "sha256": "2abfec42111a38ede1d3adeba0f50efdfce10b0e0bdd9dd9fff2afdd67961684" }, "downloads": -1, "filename": "kmeans_smote-0.1.2.tar.gz", "has_sig": false, "md5_digest": "3c3359c30114caffa2fbafa7071c3522", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10643, "upload_time": "2019-03-30T21:12:41", "url": "https://files.pythonhosted.org/packages/2f/69/081a9a7d25d0555cfecdcf11523c9b3e4225ac6021fd1274675fa893a3c8/kmeans_smote-0.1.2.tar.gz" } ] }