{ "info": { "author": "Nico de Vos", "author_email": "njdevos@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: Unix", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering" ], "description": ".. image:: https://img.shields.io/pypi/v/kmodes.svg\n :target: https://pypi.python.org/pypi/kmodes/\n :alt: Version\n.. image:: https://travis-ci.org/nicodv/kmodes.svg?branch=master\n :target: https://travis-ci.org/nicodv/kmodes\n :alt: Test Status\n.. image:: https://coveralls.io/repos/nicodv/kmodes/badge.svg\n :target: https://coveralls.io/r/nicodv/kmodes\n :alt: Test Coverage\n.. image:: https://api.codacy.com/project/badge/Grade/cb19f1f1093a44fa845ebfdaf76975f6\n :alt: Codacy Badge\n :target: https://app.codacy.com/app/nicodv/kmodes?utm_source=github.com&utm_medium=referral&utm_content=nicodv/kmodes&utm_campaign=Badge_Grade_Dashboard\n.. image:: https://requires.io/github/nicodv/kmodes/requirements.svg\n :target: https://requires.io/github/nicodv/kmodes/requirements/\n :alt: Requirements Status\n.. image:: https://img.shields.io/pypi/pyversions/kmodes.svg\n :target: https://pypi.python.org/pypi/kmodes/\n :alt: Supported Python versions\n.. image:: https://img.shields.io/github/stars/nicodv/kmodes.svg\n :target: https://github.com/nicodv/kmodes/\n :alt: Github stars\n.. image:: https://img.shields.io/pypi/l/kmodes.svg\n :target: https://github.com/nicodv/kmodes/blob/master/LICENSE\n :alt: License\n\nkmodes\n======\n\nDescription\n-----------\n\nPython implementations of the k-modes and k-prototypes clustering\nalgorithms. 
Relies on numpy for a lot of the heavy lifting.\n\nk-modes is used for clustering categorical variables. It defines clusters\nbased on the number of matching categories between data points. (This is\nin contrast to the more well-known k-means algorithm, which clusters\nnumerical data based on Euclidean distance.) The k-prototypes algorithm\ncombines k-modes and k-means and is able to cluster mixed numerical /\ncategorical data.\n\nImplemented are:\n\n- k-modes [HUANG97]_ [HUANG98]_\n- k-modes with initialization based on density [CAO09]_\n- k-prototypes [HUANG97]_\n\nThe code is modeled after the clustering algorithms in :code:`scikit-learn`\nand has the same familiar interface.\n\nI would love to have more people play around with this and give me\nfeedback on my implementation. If you come across any issues in running or\ninstalling kmodes,\n`please submit a bug report `_.\n\nEnjoy!\n\nInstallation\n------------\n\nkmodes can be installed using pip:\n\n.. code:: bash\n\n pip install kmodes\n\nTo upgrade to the latest version (recommended), run it like this:\n\n.. code:: bash\n\n pip install --upgrade kmodes\n\nAlternatively, you can build the latest development version from source:\n\n.. code:: bash\n\n git clone https://github.com/nicodv/kmodes.git\n cd kmodes\n python setup.py install\n\nUsage\n-----\n.. code:: python\n\n import numpy as np\n from kmodes.kmodes import KModes\n\n # random categorical data\n data = np.random.choice(20, (100, 10))\n\n km = KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)\n\n clusters = km.fit_predict(data)\n\n # Print the cluster centroids\n print(km.cluster_centroids_)\n\nThe examples directory showcases simple use cases of both k-modes\n('soybean.py') and k-prototypes ('stocks.py').\n\nMissing / unseen data\n_____________________\n\nThe k-modes algorithm accepts :code:`np.NaN` values as missing values in\nthe :code:`X` matrix. 
However, users are strongly encouraged to consider\nfilling in the missing data themselves in a way that makes sense for\nthe problem at hand. This is especially important when there are many missing\nvalues.\n\nThe k-modes algorithm currently handles missing data as follows. When\nfitting the model, :code:`np.NaN` values are encoded into their own\ncategory (let's call it \"unknown values\"). When predicting, the model\ntreats any values in :code:`X` that (1) it has not seen before during\ntraining, or (2) are missing, as members of the \"unknown values\"\ncategory. Simply put, the algorithm treats any missing / unseen data as\nmatching with each other but mismatching with non-missing / seen data\nwhen determining similarity between points.\n\nThe k-prototypes algorithm also accepts :code:`np.NaN` values as missing values for\nthe categorical variables, but does *not* accept missing values for the\nnumerical variables. It is up to the user to come up with a way of\nhandling these missing data that is appropriate for the problem at hand.\n\nParallel execution\n------------------\n\nThe k-modes and k-prototypes implementations both offer support for\nmultiprocessing via the\n`joblib library `_,\nsimilar to e.g.\u00a0scikit-learn's implementation of k-means, using the\n:code:`n_jobs` parameter. It generally does not make sense to set more jobs\nthan there are processor cores available on your system.\n\nThis potentially speeds up any run with more than one initialization attempt\n(:code:`n_init > 1`), which may help reduce the execution time for\nlarger problems. Note that it depends on your problem whether multiprocessing\nactually helps, so be sure to try that out first. You can check out the\nexamples for some benchmarks.\n\nFAQ\n---\n\nQ: I'm seeing errors such as :code:`TypeError: '<' not supported between instances of 'str' and 'float'`\nwhen using the :code:`kprototypes` algorithm.\n\nA: One or more of your numerical feature columns have string values in them. 
Make sure that all\ncolumns have consistent data types.\n\nReferences\n----------\n\n.. [HUANG97] Huang, Z.: Clustering large data sets with mixed numeric and\n categorical values, Proceedings of the First Pacific Asia Knowledge\n Discovery and Data Mining Conference, Singapore, pp. 21-34, 1997.\n\n.. [HUANG98] Huang, Z.: Extensions to the k-modes algorithm for clustering\n large data sets with categorical values, Data Mining and Knowledge\n Discovery 2(3), pp. 283-304, 1998.\n\n.. [CAO09] Cao, F., Liang, J., Bai, L.: A new initialization method for\n categorical data clustering, Expert Systems with Applications 36(7),\n pp. 10223-10228, 2009.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/nicodv/kmodes", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "kmodes", "package_url": "https://pypi.org/project/kmodes/", "platform": "", "project_url": "https://pypi.org/project/kmodes/", "project_urls": { "Homepage": "https://github.com/nicodv/kmodes" }, "release_url": "https://pypi.org/project/kmodes/0.10.1/", "requires_dist": [ "numpy (>=1.10.4)", "scikit-learn (>=0.19.0)", "scipy (>=0.13.3)", "joblib (>=0.11)" ], "requires_python": "", "summary": "Python implementations of the k-modes and k-prototypes clustering algorithms for clustering categorical data.", "version": "0.10.1" }, "last_serial": 5178247, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "54ce7de30b11661986afdbe7abd8cea2", "sha256": "8a80cd81be09418c3e960707b3dae53bb0567988392dae67dd8f30bf6ec841fa" }, "downloads": -1, "filename": "kmodes-0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "54ce7de30b11661986afdbe7abd8cea2", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13185, "upload_time": "2016-02-28T02:53:57", "url": 
"https://files.pythonhosted.org/packages/19/b1/645c6d6087413e728cb03a1eece82f4a4f6695341dc7882bdedc2fbfb174/kmodes-0.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0807067de37d4ec1eb0ec26a2fa4bc26", "sha256": "4b287d3d1d597cbeba225eeb50698194a82d45059be9c69b1b9240bfc9cda7e9" }, "downloads": -1, "filename": "kmodes-0.1.tar.gz", "has_sig": false, "md5_digest": "0807067de37d4ec1eb0ec26a2fa4bc26", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9179, "upload_time": "2016-02-28T02:54:10", "url": "https://files.pythonhosted.org/packages/74/b9/e3d2169a69fdfb610db4e15b510425581fae712c2d61ef340be41cf9307f/kmodes-0.1.tar.gz" } ], "0.10.0": [ { "comment_text": "", "digests": { "md5": "bf53baacfef28488bfee32a5e0da1f5e", "sha256": "6b58ba09654a7c5919caf93e9190ba6bfa3208edd30fe0f7a4e185ab363043a0" }, "downloads": -1, "filename": "kmodes-0.10.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "bf53baacfef28488bfee32a5e0da1f5e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 17333, "upload_time": "2019-04-04T17:23:24", "url": "https://files.pythonhosted.org/packages/56/23/568339f89723c07b349440c18fda932bb20d79923b141e4c259a5d469c5c/kmodes-0.10.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ad5d3ea5ef36b67da06437b8dd0273be", "sha256": "501cc6a49c0e92dda4dfe94bbe0a5cea1dd6a25e3e3f576026ec94c4f618812a" }, "downloads": -1, "filename": "kmodes-0.10.0.tar.gz", "has_sig": false, "md5_digest": "ad5d3ea5ef36b67da06437b8dd0273be", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13593, "upload_time": "2019-04-04T17:23:26", "url": "https://files.pythonhosted.org/packages/9b/5b/3eae93487063c997f2ae6c4fa25f274af11550215fb680266d3ab7dd343c/kmodes-0.10.0.tar.gz" } ], "0.10.1": [ { "comment_text": "", "digests": { "md5": "14a3b381013303ccd33e539444ccba46", "sha256": 
"bce1108382bffc09902c2fe5e1acb1cbb10736634efce2af88f195b4998f7c5e" }, "downloads": -1, "filename": "kmodes-0.10.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "14a3b381013303ccd33e539444ccba46", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 17834, "upload_time": "2019-04-23T16:39:48", "url": "https://files.pythonhosted.org/packages/79/c0/f7d8a0eb41ac6f302b4bc100f91b6e0f2558425ccfefaa0ec0430f77ee97/kmodes-0.10.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0d8be7c0db5a99df92a1678532e87bd2", "sha256": "3222c2f672a6356be353955c02d1e38472d9f6074744b4ffb0c058e8c2235ea1" }, "downloads": -1, "filename": "kmodes-0.10.1.tar.gz", "has_sig": false, "md5_digest": "0d8be7c0db5a99df92a1678532e87bd2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13943, "upload_time": "2019-04-23T16:39:50", "url": "https://files.pythonhosted.org/packages/1f/d2/30cf21fe6f108e11fcf3f3a26b265756c3769fbbd5ea2079319ee293652c/kmodes-0.10.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "e8ded61ca3c07b36ded7ba2c2849f832", "sha256": "0f7cfc7a5e1901e88623cfa38ce916f39354a7a2589184310cb9b320763d07a2" }, "downloads": -1, "filename": "kmodes-0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e8ded61ca3c07b36ded7ba2c2849f832", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13422, "upload_time": "2016-03-02T20:50:03", "url": "https://files.pythonhosted.org/packages/22/c7/ce0be8fbfcb413dac15fc914e4491d5ee379d93b2cee9eed1eda7271928b/kmodes-0.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6c05344efaac7f39df7e4d4ed57061f5", "sha256": "7d5ca27fd24fd251fe38a10d07102792593a39728e080f5db649806533fd418a" }, "downloads": -1, "filename": "kmodes-0.2.tar.gz", "has_sig": false, "md5_digest": "6c05344efaac7f39df7e4d4ed57061f5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 
9424, "upload_time": "2016-03-02T20:50:30", "url": "https://files.pythonhosted.org/packages/7d/ce/550fda8c97d08d5ac38099eee0f506332edfb2e4e5b5db4515e4deec77aa/kmodes-0.2.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "4dd7330df5ae1c42c706fef8da0dfbda", "sha256": "6bd6b20a33fd7f4d320235e55b2bac72947c4a76921f8642a1f0fc6422379bb4" }, "downloads": -1, "filename": "kmodes-0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "4dd7330df5ae1c42c706fef8da0dfbda", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13788, "upload_time": "2016-03-14T21:15:47", "url": "https://files.pythonhosted.org/packages/d7/4f/3e6b4a538c16f607c582017a8efa8c39d9d44d599b2a0b97090be91e3a62/kmodes-0.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b373c0670cc093aadfba68067b3e18c9", "sha256": "0602dcc585c8f650cc732dba644f1e8334fe1107f75edcde9282e8cda860500e" }, "downloads": -1, "filename": "kmodes-0.4.tar.gz", "has_sig": false, "md5_digest": "b373c0670cc093aadfba68067b3e18c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9837, "upload_time": "2016-03-14T21:15:55", "url": "https://files.pythonhosted.org/packages/65/bc/9beac91d3d997bc85f238e977ddb8c50b3dfaca129e5c689de057c451e4f/kmodes-0.4.tar.gz" } ], "0.5": [ { "comment_text": "", "digests": { "md5": "be11e6f128add1eb99b90a06fdb0f79e", "sha256": "4c5166878acc5142062108f8bdd483ddc518cd9a8432dfa414c85524947fddd7" }, "downloads": -1, "filename": "kmodes-0.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "be11e6f128add1eb99b90a06fdb0f79e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 14309, "upload_time": "2016-05-27T19:48:48", "url": "https://files.pythonhosted.org/packages/53/ea/616ed969206d24159486687745ba48aacde6b35cc414d77ffa6708718ae1/kmodes-0.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "71a3ca831a14b8ff23725efe61f74843", "sha256": 
"fa1796c57dcb015f393528c969aa0fd040c42211a904cdb8c1a0941e4a4c5869" }, "downloads": -1, "filename": "kmodes-0.5.tar.gz", "has_sig": false, "md5_digest": "71a3ca831a14b8ff23725efe61f74843", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10345, "upload_time": "2016-05-27T19:48:55", "url": "https://files.pythonhosted.org/packages/36/04/366f7f5b3674a41f06a99edfe4cfcf620f67e1b68180e454fef0fc0635d0/kmodes-0.5.tar.gz" } ], "0.6": [ { "comment_text": "", "digests": { "md5": "877e276481bdc2d28d71e92525aea52d", "sha256": "d5ba99bd39b1452e81de551cdfca689c05fe2b8db927831e1c9609c1353d3e80" }, "downloads": -1, "filename": "kmodes-0.6-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "877e276481bdc2d28d71e92525aea52d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 16005, "upload_time": "2016-06-08T15:22:46", "url": "https://files.pythonhosted.org/packages/cb/d1/a025c0ef91af63f77b2e44b7bce0c1380c752a94aa42aa2fc9715b4d7ad0/kmodes-0.6-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f1ed8ca10caddaa18c0bcd4d3c52b76b", "sha256": "14eab4ae818e143177b4e195d45ef260a6789f2c9f889b1a1e07cbf108e78be4" }, "downloads": -1, "filename": "kmodes-0.6.tar.gz", "has_sig": false, "md5_digest": "f1ed8ca10caddaa18c0bcd4d3c52b76b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11440, "upload_time": "2016-06-08T15:22:50", "url": "https://files.pythonhosted.org/packages/44/eb/0b07bb150b8b725bfe997b4a1199c095191a1f433da1b0a10b04e2be497a/kmodes-0.6.tar.gz" } ], "0.7": [ { "comment_text": "", "digests": { "md5": "7c12cabb8a1334a0434d04fa3bd5675c", "sha256": "75ed3b73ac1fd3549c8dabc1698f0dab2e36ccc6bd20210ed6e77e65b033c34d" }, "downloads": -1, "filename": "kmodes-0.7-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "7c12cabb8a1334a0434d04fa3bd5675c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 17838, 
"upload_time": "2017-03-30T17:42:16", "url": "https://files.pythonhosted.org/packages/c8/59/c1e993d211926609c1345ea030505f378acc33aff9e6e8d4a57ffbaf09e0/kmodes-0.7-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5353ccf57dbd8bbee79131c9412affcf", "sha256": "2c3f86028e5bff5213a5ac77544c79a1eb59389de59c410a22f5ac0501e85bec" }, "downloads": -1, "filename": "kmodes-0.7.tar.gz", "has_sig": false, "md5_digest": "5353ccf57dbd8bbee79131c9412affcf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11435, "upload_time": "2017-03-30T17:42:18", "url": "https://files.pythonhosted.org/packages/09/24/a3c870c74f5ef511f8c4a63b03cdb933d2818f1cf37cfc2aa8ef85c9985c/kmodes-0.7.tar.gz" } ], "0.8": [ { "comment_text": "", "digests": { "md5": "e45784894e5b38cf6702c2f9f2e014a1", "sha256": "83ff6db3e142c3b92f00445f6aa1405031587a0eb704bafcc9a525e21bfa4646" }, "downloads": -1, "filename": "kmodes-0.8-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e45784894e5b38cf6702c2f9f2e014a1", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 18818, "upload_time": "2017-11-15T20:12:21", "url": "https://files.pythonhosted.org/packages/cd/e1/6c0c370093da91207b7a4d5547fc2dc1c0219b492002a3519234768bcecd/kmodes-0.8-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d7f3ee8d56b06e8aff185d791d657ddb", "sha256": "5c4a2bc035b6a2bba824bd697cdddd31bc626327d7514da84fc4cdc2eb6ce601" }, "downloads": -1, "filename": "kmodes-0.8.tar.gz", "has_sig": false, "md5_digest": "d7f3ee8d56b06e8aff185d791d657ddb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12548, "upload_time": "2017-11-15T20:12:22", "url": "https://files.pythonhosted.org/packages/54/70/b9449c133353e4ccb8bed0a5e560d086437f597e852324e53d410636a67d/kmodes-0.8.tar.gz" } ], "0.9": [ { "comment_text": "", "digests": { "md5": "e45bc62b1ab97a0c8d8a22c321c82a80", "sha256": 
"29e101445feacd9b62c1eb4e54260e8c7db983c9bf9dbffcc865611ab356aff0" }, "downloads": -1, "filename": "kmodes-0.9-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e45bc62b1ab97a0c8d8a22c321c82a80", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 15741, "upload_time": "2018-05-02T22:46:49", "url": "https://files.pythonhosted.org/packages/1a/d5/54e0efa2ddf33234761f8c35471f4a2280bb4cef44fd39fa08f1e663946a/kmodes-0.9-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b3621b47a3b842ba7dd52f6d0f0161e3", "sha256": "c90f2cc3ebeb4d9f81f689250537c0de8825b54fb6887579d3f27c349a7807cd" }, "downloads": -1, "filename": "kmodes-0.9.tar.gz", "has_sig": false, "md5_digest": "b3621b47a3b842ba7dd52f6d0f0161e3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12418, "upload_time": "2018-05-02T22:46:51", "url": "https://files.pythonhosted.org/packages/1e/f3/59a84cc0edf9bd542ed334833af94edae91e391397652ed84938afb800d7/kmodes-0.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "14a3b381013303ccd33e539444ccba46", "sha256": "bce1108382bffc09902c2fe5e1acb1cbb10736634efce2af88f195b4998f7c5e" }, "downloads": -1, "filename": "kmodes-0.10.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "14a3b381013303ccd33e539444ccba46", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 17834, "upload_time": "2019-04-23T16:39:48", "url": "https://files.pythonhosted.org/packages/79/c0/f7d8a0eb41ac6f302b4bc100f91b6e0f2558425ccfefaa0ec0430f77ee97/kmodes-0.10.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0d8be7c0db5a99df92a1678532e87bd2", "sha256": "3222c2f672a6356be353955c02d1e38472d9f6074744b4ffb0c058e8c2235ea1" }, "downloads": -1, "filename": "kmodes-0.10.1.tar.gz", "has_sig": false, "md5_digest": "0d8be7c0db5a99df92a1678532e87bd2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 
13943, "upload_time": "2019-04-23T16:39:50", "url": "https://files.pythonhosted.org/packages/1f/d2/30cf21fe6f108e11fcf3f3a26b265756c3769fbbd5ea2079319ee293652c/kmodes-0.10.1.tar.gz" } ] }