{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering", "Topic :: Software Development" ], "description": ".. -*- mode: rst -*-\n\n.. _scikit-learn: http://scikit-learn.org/stable/\n\n.. _imbalanced-learn: http://imbalanced-learn.org/en/stable/\n\n|Travis|_ |AppVeyor|_ |Codecov|_ |CircleCI|_ |ReadTheDocs|_ |PythonVersion|_ |Pypi|_ |Conda|_ |DOI|_ |Black|_\n\n.. |Travis| image:: https://travis-ci.org/AlgoWit/cluster-over-sampling.svg?branch=master\n.. _Travis: https://travis-ci.org/AlgoWit/cluster-over-sampling\n\n.. |AppVeyor| image:: https://ci.appveyor.com/api/projects/status/fnhxhlv16ovfhlyw/branch/master?svg=true\n.. _AppVeyor: https://ci.appveyor.com/project/georgedouzas/cluster-over-sampling/history\n\n.. |Codecov| image:: https://codecov.io/gh/AlgoWit/cluster-over-sampling/branch/master/graph/badge.svg\n.. _Codecov: https://codecov.io/gh/AlgoWit/cluster-over-sampling\n\n.. |CircleCI| image:: https://circleci.com/gh/AlgoWit/cluster-over-sampling/tree/master.svg?style=svg\n.. _CircleCI: https://circleci.com/gh/AlgoWit/cluster-over-sampling/tree/master\n\n.. |ReadTheDocs| image:: https://readthedocs.org/projects/cluster-over-sampling/badge/?version=latest\n.. _ReadTheDocs: https://cluster-over-sampling.readthedocs.io/en/latest/?badge=latest\n\n.. |PythonVersion| image:: https://img.shields.io/pypi/pyversions/cluster-over-sampling.svg\n.. _PythonVersion: https://img.shields.io/pypi/pyversions/cluster-over-sampling.svg\n\n.. |Pypi| image:: https://badge.fury.io/py/cluster-over-sampling.svg\n.. _Pypi: https://badge.fury.io/py/cluster-over-sampling\n\n.. 
|Conda| image:: https://anaconda.org/algowit/cluster-over-sampling/badges/installer/conda.svg\n.. _Conda: https://conda.anaconda.org/algowit\n\n.. |DOI| image:: https://zenodo.org/badge/DOI/10.1016/j.eswa.2017.03.073.svg\n.. _DOI: https://doi.org/10.1016/j.eswa.2017.03.073\n\n.. |Black| image:: https://img.shields.io/badge/code%20style-black-000000.svg\n.. _Black: https://github.com/ambv/black\n\n=====================\ncluster-over-sampling\n=====================\n\nImplementation of a general interface for clustering-based over-sampling\nalgorithms [1]_, [2]_. It is compatible with scikit-learn_ and\nimbalanced-learn_.\n\nDocumentation\n-------------\n\nInstallation documentation, API documentation, and examples can be found in the\ndocumentation_.\n\n.. _documentation: https://cluster-over-sampling.readthedocs.io/en/latest/\n\nDependencies\n------------\n\ncluster-over-sampling is tested to work under Python 3.6+. The dependencies\nare the following:\n\n- numpy(>=1.1)\n- scikit-learn(>=0.21)\n- imbalanced-learn(>=0.4.3)\n\nAdditionally, to run the examples, you need matplotlib(>=2.0.0) and\npandas(>=0.22).\n\nInstallation\n------------\n\ncluster-over-sampling is currently available on PyPI,\nand you can install it via `pip`::\n\n pip install -U cluster-over-sampling\n\nThe package is also released on the Anaconda Cloud platform::\n\n conda install -c algowit cluster-over-sampling\n\nIf you prefer, you can clone the repository and install it from source. 
Use the following\ncommands to get a copy from GitHub and install all dependencies::\n\n git clone https://github.com/AlgoWit/cluster-over-sampling.git\n cd cluster-over-sampling\n pip install .\n\nOr install using pip and GitHub::\n\n pip install -U git+https://github.com/AlgoWit/cluster-over-sampling.git\n\nTesting\n-------\n\nAfter installation, you can use `pytest` to run the test suite::\n\n make test\n\nAbout\n-----\n\nIf you use cluster-over-sampling in a scientific publication, we would\nappreciate citations to any of the following papers::\n\n @article{Douzas2017,\n doi = {10.1016/j.eswa.2017.03.073},\n url = {https://doi.org/10.1016/j.eswa.2017.03.073},\n year = {2017},\n month = oct,\n publisher = {Elsevier {BV}},\n volume = {82},\n pages = {40--52},\n author = {Georgios Douzas and Fernando Bacao},\n title = {Self-Organizing Map Oversampling ({SOMO}) for imbalanced data set learning},\n journal = {Expert Systems with Applications}\n }\n\n @article{Douzas2018,\n doi = {10.1016/j.ins.2018.06.056},\n url = {https://doi.org/10.1016/j.ins.2018.06.056},\n year = {2018},\n month = oct,\n publisher = {Elsevier {BV}},\n volume = {465},\n pages = {1--20},\n author = {Georgios Douzas and Fernando Bacao and Felix Last},\n title = {Improving imbalanced learning through a heuristic oversampling method based on k-means and {SMOTE}},\n journal = {Information Sciences}\n }\n\nLearning from class-imbalanced data continues to be a common and challenging\nproblem in supervised learning, as standard classification algorithms are\ndesigned to handle balanced class distributions. While different strategies\nexist to tackle this problem, methods that generate artificial data to achieve\na balanced class distribution are more versatile than modifications to the\nclassification algorithm. The SMOTE algorithm [3]_, as well as any other\nover-sampling method based on the SMOTE mechanism, generates synthetic samples\nalong line segments that join minority class instances. 
SMOTE addresses only\nthe issue of between-class imbalance. In contrast, clustering the\ninput space and applying any over-sampling algorithm to each resulting cluster\nwith an appropriate resampling ratio also addresses the within-class\nimbalance issue.\n\nReferences\n----------\n\n.. [1] G. Douzas, F. Bacao, \"Self-Organizing Map Oversampling (SOMO)\n for imbalanced data set learning\", Expert Systems with Applications,\n vol. 82, pp. 40-52, 2017.\n\n.. [2] G. Douzas, F. Bacao, F. Last, \"Improving imbalanced learning\n through a heuristic oversampling method based on k-means and SMOTE\",\n Information Sciences, vol. 465, pp. 1-20, 2018.\n\n.. [3] N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, \"SMOTE:\n synthetic minority over-sampling technique\", Journal of Artificial\n Intelligence Research, vol. 16, pp. 321-357, 2002.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "https://github.com/AlgoWit/cluster-over-sampling", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/AlgoWit/cluster-over-sampling", "keywords": "", "license": "MIT", "maintainer": "G. 
Douzas", "maintainer_email": "gdouzas@icloud.com", "name": "cluster-over-sampling", "package_url": "https://pypi.org/project/cluster-over-sampling/", "platform": "", "project_url": "https://pypi.org/project/cluster-over-sampling/", "project_urls": { "Download": "https://github.com/AlgoWit/cluster-over-sampling", "Homepage": "https://github.com/AlgoWit/cluster-over-sampling" }, "release_url": "https://pypi.org/project/cluster-over-sampling/0.1.1/", "requires_dist": [ "scipy (>=0.17)", "numpy (>=1.1)", "scikit-learn (>=0.21)", "imbalanced-learn (>=0.4.3)", "sphinx (==1.8.5) ; extra == 'docs'", "sphinx-gallery ; extra == 'docs'", "sphinx-rtd-theme ; extra == 'docs'", "numpydoc ; extra == 'docs'", "matplotlib ; extra == 'docs'", "pandas ; extra == 'docs'", "pytest ; extra == 'tests'", "pytest-cov ; extra == 'tests'" ], "requires_python": "", "summary": "Clustering based over-sampling.", "version": "0.1.1" }, "last_serial": 5692951, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "0d1e17fb05cde5474dfce0ef7babbe11", "sha256": "3837c327f7780601fa07f25ad8284c88598a7ba1ef2b63b2c15ed71e4d8e3810" }, "downloads": -1, "filename": "cluster_over_sampling-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "0d1e17fb05cde5474dfce0ef7babbe11", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20928, "upload_time": "2019-08-17T14:36:36", "url": "https://files.pythonhosted.org/packages/48/db/baa897164f4c86d1c4fa6e2b1e0321aafaaa9b3f881b8728b37185556e38/cluster_over_sampling-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ccf14c7bc2e740780349ff6da2db3758", "sha256": "1f2e273c8eec1562ddaa9d8902ff0959292fb1f8edb4c480b83db569fe58db80" }, "downloads": -1, "filename": "cluster-over-sampling-0.1.0.tar.gz", "has_sig": false, "md5_digest": "ccf14c7bc2e740780349ff6da2db3758", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9884422, "upload_time": "2019-08-17T14:39:30", "url": 
"https://files.pythonhosted.org/packages/eb/0a/75be7a7aad1e0832ec140189dd2be188ec5bb0dcc9f8613ae96a87474a7a/cluster-over-sampling-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "5fbc35f9d749f976f243cb403486c0a2", "sha256": "d8c310335dfd154c3075112404ecad3a0959ea3f16b27d14a1337e79b1d5f7f7" }, "downloads": -1, "filename": "cluster_over_sampling-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5fbc35f9d749f976f243cb403486c0a2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20919, "upload_time": "2019-08-17T22:51:51", "url": "https://files.pythonhosted.org/packages/7b/c3/1d7ea8968f8ef33dd2559c5d0f5c351c64177705bfd451dba94307904b80/cluster_over_sampling-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ddf110a21a081594741ed578369710dd", "sha256": "03832a637ed52abe933925a0f0fb070126ee1085b6c5cf90d9a227392448efae" }, "downloads": -1, "filename": "cluster-over-sampling-0.1.1.tar.gz", "has_sig": false, "md5_digest": "ddf110a21a081594741ed578369710dd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9884497, "upload_time": "2019-08-17T22:53:23", "url": "https://files.pythonhosted.org/packages/d6/ff/5359a80c153c66126c6054c498d36068c553776d0a2d8084ceb1cc77cab0/cluster-over-sampling-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5fbc35f9d749f976f243cb403486c0a2", "sha256": "d8c310335dfd154c3075112404ecad3a0959ea3f16b27d14a1337e79b1d5f7f7" }, "downloads": -1, "filename": "cluster_over_sampling-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5fbc35f9d749f976f243cb403486c0a2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20919, "upload_time": "2019-08-17T22:51:51", "url": "https://files.pythonhosted.org/packages/7b/c3/1d7ea8968f8ef33dd2559c5d0f5c351c64177705bfd451dba94307904b80/cluster_over_sampling-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"ddf110a21a081594741ed578369710dd", "sha256": "03832a637ed52abe933925a0f0fb070126ee1085b6c5cf90d9a227392448efae" }, "downloads": -1, "filename": "cluster-over-sampling-0.1.1.tar.gz", "has_sig": false, "md5_digest": "ddf110a21a081594741ed578369710dd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9884497, "upload_time": "2019-08-17T22:53:23", "url": "https://files.pythonhosted.org/packages/d6/ff/5359a80c153c66126c6054c498d36068c553776d0a2d8084ceb1cc77cab0/cluster-over-sampling-0.1.1.tar.gz" } ] }