{ "info": { "author": "Benjamin Paassen", "author_email": "bpaassen@techfak.uni-bielefeld.de", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Prototype-based Machine Learning on Distance Data\n\nCopyright (C) 2019 - Benjamin Paassen \nMachine Learning Research Group \nCenter of Excellence Cognitive Interaction Technology (CITEC) \nBielefeld University\n\nThis program is free software; you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation; either version 3 of the License, or\n(at your option) any later version.\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\nGNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License\nalong with this program; if not, see .\n\n## Introduction\n\nThis [scikit-learn][scikit] compatible, Python3 library provides several algorithms\nto learn prototype models on distance data. At this time, this library features\nthe following algorithms:\n\n* Relational Neural Gas ([Hammer and Hasenfuss, 2007][Ham2007]) for clustering,\n* Relational Generalized Learning Vector Quantization ([Hammer, Hofmann, Schleif, and Zhu, 2014][Ham2014]) for classification, and\n* Median Generalized Learning Vector Quantization ([Nebel, Hammer, Frohberg, and Villmann, 2015][Neb2015]) for classification.\n\nRefer to the Quickstart Guide for a note on how to use these models and\nrefer to the Background section for more details on the algorithms.\n\nNote that this library follows the \n\nIf you intend to use this library in academic work, please cite the respective\nreference paper.\n\n## Installation\n\nThis package is available on `pypi` as `proto_dist_ml`. You can install\nit via\n\n```\npip install proto_dist_ml\n```\n\n## QuickStart Guide\n\nFor an example we recommend to take a look at the demo in the notebook\n`demo.ipynb`. In general, all models in this library follow the [scikit-learn][scikit]\nconvention, i.e. you need to perform the following steps:\n\n1. Instantiate your model, e.g. via `model = proto_dist_ml.rng.RNG(K)` where\n `K` is the number of prototypes.\n2. Fit your model to training data, e.g. via `model.fit(D)`, where `D` is the\n matrix of pairwise distances between your training data points.\n3. Perform a prediction for test data, e.g. via `model.predict(D)`, where `D`\n is the matrix of distances from the test to the training data points.\n\n## Background\n\nThe basic idea of prototype models is that we can cluster and\nclassify data by assigning them to the cluster/class of the closest _prototype_,\nwhere a prototype is a data point that represents the cluster/class well.\n\nIn case of distance data, we can not express a prototype in vectorial form but\ninstead need to use an indirect form, namely a convex combination of existing\ndata points. In other words, our $`k`$th prototype $`w_k`$ is defined as\n\n```math\n\\vec w_k = \\sum_{i=1}^m \\alpha_{k, i} \\cdot \\vec x_i\n\\qquad \\text{where } \\sum_{i=1}^m \\alpha_{k, i} = 1\n\\text{ and } \\alpha_{k, i} \\geq 0 \\quad \\forall i\n```\n\nwhere $`\\vec x_1, \\ldots, \\vec x_m`$ are the training data points and where\n$`\\alpha_{k, 1}, \\ldots, \\alpha_{k, m}`$ are the convex coefficiants\nrepresenting prototype $`w_k`$. Because the prototype is fully specified by\nthe data and the convex coefficients, we do not need any explicit form for\n$`w_k`$ anymore.\n\nTo cluster/classify new data, we now need to determine the distance between a\ndata point $`x`$ and a prototpe $`w_k`$. As it turns out, this distance can\nalso be expressed solely in terms of the convex coefficients and the\ndata-to-data distances. In particular, we obtain:\n\n```math\nd(x, w_k)^2 = \\sum_{i=1}^m \\alpha_{k, i} \\cdot d(x, x_i)^2\n- \\frac{1}{2} \\sum_{i=1}^m \\sum_{j=1}^m \\alpha_{k, i} \\cdot \\alpha_{k, j} \\cdot d(x_i, x_j)^2\n```\n\nIn matrix-vector notation we obtain:\n\n```math\nd(x, w_k)^2 = {\\vec \\alpha_k}^T \\cdot \\vec d^2\n- \\frac{1}{2} {\\vec \\alpha_k}^T \\cdot D^2 \\cdot \\vec \\alpha_k\n```\n\nwhere $`\\vec d^2`$ the vector of distances between $`x`$ and all training\ndata points $`x_i`$ and where $`D^2`$ is the distance matrix between the\ntraining data points.\n\nThe main challenge for distance-based prototype learning is now to optimize\nthe coefficients $`\\alpha_{k, i}`$ according to some meaningful loss function.\nThe loss function and its optimization differs between the algorithms. In more\ndetail, we take the following approaches.\n\n### Relational Neural Gas\n\nRelational neural gas (RNG; [Hammer and Hasenfuss, 2007][Ham2007]) is a\nclustering approach that tries to optimize the loss function\n\n```math\n\\sum_{i=1}^m \\sum_{k=1}^K h_{i, k} \\cdot d(x_i, w_k)^2\n```\n\nwhere $`h_{i, k}`$ quantifies how responsible prototype $`w_k`$ is for\ndata point $`x_i`$. This term is calculated as follows:\n\n```math\nh_{i, k} = \\exp(-r_{i, k} / \\lambda) \\qquad \\text{where } r_{i, k} = |\\{ l | d(x_i, w_l) < d(x_i, w_k) \\}|\n```\n\nIn other words, $`w_k`$ is the $`r_{i, k}`$-closest prototype to data point\n$`x_i`$ and the lower ranked a prototype is (i.e. the closer it is), the higher\nis its responsibility for the data point. $`\\lambda`$ is a scaling factor that\nexpresses how many prototypes are still considered. Per default, we start with\n$`\\lambda = K / 2`$ and then anneal $`\\lambda`$ until $`\\lambda = 0.01`$, i.e.\nonly the closest prototype is considered. At that point, the loss above is\nequivalent to the $`K`$-means loss.\n\nGiven the current values for $h_{i, k}$, optimizing the convex coefficients\n$`\\alpha_{k, i}`$ is possible in closed form. In particular, we obtain:\n$`\\alpha_{k, i} = h_{i, k} / \\sum_j h_{j, k}`$. The RNG\ntraining procedure thus consists of three steps which are iterated in each\ntraining epoch:\n\n1. Compute the responsibilities $`h_{i, k}`$.\n2. Compute the new convex coefficients $`\\alpha_{k, i}`$.\n3. Decrease $`\\lambda`$.\n\n### Relational Generalized Learning Vector Quantization\n\nRelational generalized learning vector quantization (RGLVQ; [Hammer, Hofmann, Schleif, and Zhu, 2014][Ham2014])\nis a classification approach which aims at optimizing the generalized learning\nvector quantization loss:\n\n```math\n\\sum_{i=1}^m \\Phi\\Big(\\frac{d_i^+ - d_i^-}{d_i^+ + d_i^-}\\Big)\n```\n\nwhere $`d_i^+`$ is the distance of data point $`x_i`$ to the closest prototype\nwith the same label, $`d_i^-`$ is the distance of data point $`x_i`$ to the\nclosest prototype with a different label, and $`\\Phi`$ is a squashing\nfunction (such as tanh or the logistic function).\nNote that data point $`x_i`$ is correctly classified if and only\nif $`d_i^+ - d_i^- < 0`$. As such, the GLVQ loss can be regarded as a soft\napproximation of the classification error.\n\nNote that this loss has the drawback that distances need to be strictly\npositive in order to guarantee a nonzero denominator. This excludes\nnon-Euclidean distances (i.e. distances which do not correspond to an inner\nproduct) because these may imply negative data-to-prototype distances.\n\nWe optimize this loss via L-BFGS, restricting the coefficients to be convex\nin each step. The gradient follows directly from the\nformula above and the distance formula above. For more details, refer to\n([Hammer et al., 2014][Ham2014]).\n\n### Median Generalized Learning Vector Quantization\n\nMedian generalized learning vector quantization\n(MGLVQ; [Nebel, Hammer, Frohberg, and Villmann, 2015][Neb2015]) is a variant\nof GLVQ that restricts prototypes to be strictly data points, i.e. for each\nprototype $`w_k`$ there exists exactly one $`i`$, such that $`\\alpha_{k, i} = 1`$\nand every other coefficient is zero. This has two key advantages. First, it\npermits non-Euclidean and even asymmetric distances because we do not rely on\nan interpolation between data points. Second, it is more efficient during\nclassification because we can compute the distances to the prototypes directly\nand do not need to use the relational distance formula above.\n\nHowever, MGLVQ is also more challenging to train because we can not perform a\nsmooth gradient method but instead must apply a discrete optimization scheme.\nIn this toolbox, we optimize the GLVQ loss (see above) via greedy hill climbing,\ni.e. we try to find any prototype-datapoint combination that would reduce the\nloss and apply the first such combination we find until no such move exists\nanymore.\n\n## Contents\n\nThis library contains the following files.\n\n* `demo.ipynb` : A demo script illustrating how to use this library.\n* `LICENSE.md` : A copy of the GPLv3 license.\n* `mglvq_test.py` : A set of unit tests for `mglvq.py`.\n* `proto_dist_ml/mglvq.py` : An implementation of median generalized\n learning vector quantization.\n* `proto_dist_ml/rglvq.py` : An implementation of relational generalized\n learning vector quantization.\n* `proto_dist_ml/rng.py` : An implementation of relational neural gas.\n* `README.md` : This file.\n* `rglvq_test.py` : A set of unit tests for `rglvq.py`.\n* `rng_test.py` : A set of unit tests for `rng.py`.\n\n## Licensing\n\nThis library is licensed under the [GNU General Public License Version 3][GPLv3].\n\n## Dependencies\n\nThis library depends on [NumPy][np] for matrix operations, on [scikit-learn][scikit]\nfor the base interfaces and on [SciPy][scipy] for optimization.\n\n## Literature\n\n* Hammer, B. & Hasenfuss, A. (2007). Relational Neural Gas. Proceedings of the\n 30th Annual German Conference on AI (KI 2007), 190-204. doi:[10.1007/978-3-540-74565-5_16](https://doi.org/10.1007/978-3-540-74565-5_16). [Link][Ham2007]\n* Hammer, B., Hofmann, D., Schleif, F., & Zhu, X. (2014). Learning vector\n quantization for (dis-)similarities. Neurocomputing, 131, 43-51.\n doi:[10.1016/j.neucom.2013.05.054](https://doi.org/10.1016/j.neucom.2013.05.054). [Link][Ham2014]\n* Nebel, D., Hammer, B., Frohberg, K., & Villmann, T. (2015). Median variants\n of learning vector quantization for learning of dissimilarity data.\n Neurocomputing, 169, 295-305. doi:[10.1016/j.neucom.2014.12.096][Neb2015]\n\n\n\n[scikit]: https://scikit-learn.org/stable/ \"Scikit-learn homepage\"\n[np]: http://numpy.org/ \"Numpy homepage\"\n[scipy]: https://scipy.org/ \"SciPy homepage\"\n[GPLv3]: https://www.gnu.org/licenses/gpl-3.0.en.html \"The GNU General Public License Version 3\"\n[Ham2007]:https://www.researchgate.net/publication/221562215_Relational_Neural_Gas \"Hammer, B. & Hasenfuss, A. (2007). Relational Neural Gas. Proceedings of the 30th Annual German Conference on AI (KI 2007), 190-204. doi:10.1007/978-3-540-74565-5_16\"\n[Ham2014]:http://www.techfak.uni-bielefeld.de/~fschleif/pdf/nc_diss_2014.pdf \"Hammer, B., Hofmann, D., Schleif, F., & Zhu, X. (2014). Learning vector quantization for (dis-)similarities. Neurocomputing, 131, 43-51. doi:10.1016/j.neucom.2013.05.054\"\n[Neb2015]:https://doi.org/10.1016/j.neucom.2014.12.096 \"Nebel, D., Hammer, B., Frohberg, K., & Villmann, T. (2015). Median variants of learning vector quantization for learning of dissimilarity data. Neurocomputing, 169, 295-305. doi:10.1016/j.neucom.2014.12.096\"\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://gitlab.ub.uni-bielefeld.de/bpaassen/proto-dist-ml", "keywords": "relational-neural-gas learning-vector-quantization lvq clustering classification machine-learning distances", "license": "", "maintainer": "", "maintainer_email": "", "name": "proto-dist-ml", "package_url": "https://pypi.org/project/proto-dist-ml/", "platform": "", "project_url": "https://pypi.org/project/proto-dist-ml/", "project_urls": { "Homepage": "https://gitlab.ub.uni-bielefeld.de/bpaassen/proto-dist-ml" }, "release_url": "https://pypi.org/project/proto-dist-ml/1.0.1/", "requires_dist": [ "numpy", "scikit-learn", "scipy" ], "requires_python": "", "summary": "Prototype-based Machine Learning on Distance Data.", "version": "1.0.1" }, "last_serial": 5724553, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "d97e64ffed32ae6a0847e7118732a83d", "sha256": "68f5ccce5c17c1eb45613385f50633cdeffb812d81afbd1ac6a281af32b4fc3f" }, "downloads": -1, "filename": "proto_dist_ml-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "d97e64ffed32ae6a0847e7118732a83d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 30518, "upload_time": "2019-08-20T18:35:47", "url": "https://files.pythonhosted.org/packages/c3/aa/d383321766b88310d19ea9c12c1c53adb7a03b1a3103552a53cb80208267/proto_dist_ml-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "917f7a112643912dcfbe1fb9b8ab8346", "sha256": "b37ceefba452d968a3314a5da043845bad5d0d0d148faa92bf051000753407a6" }, "downloads": -1, "filename": "proto_dist_ml-1.0.0.tar.gz", "has_sig": false, "md5_digest": "917f7a112643912dcfbe1fb9b8ab8346", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18873, "upload_time": "2019-08-20T18:35:49", "url": "https://files.pythonhosted.org/packages/cd/4a/966881a1ff3f32c54deab838470f4e7894a5cccada511ad7f5e58638bc8e/proto_dist_ml-1.0.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "cd3b98b0add015f0520ee1e186f69b40", "sha256": "865e6230ea16be6f817876de4c36181e87f087b68872bca5e7d0c165f2e94f1e" }, "downloads": -1, "filename": "proto_dist_ml-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "cd3b98b0add015f0520ee1e186f69b40", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 30520, "upload_time": "2019-08-24T14:26:32", "url": "https://files.pythonhosted.org/packages/98/a7/1fb559b58caf104dda5ade55500c9e434103278a2f0f8237fd7b18ca59ef/proto_dist_ml-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "815679a50b9ec396b9c36ab5f20eb0ce", "sha256": "323345e263e427307cca22b7e8574c49c05712beeafd78256362bb4efe657c46" }, "downloads": -1, "filename": "proto_dist_ml-1.0.1.tar.gz", "has_sig": false, "md5_digest": "815679a50b9ec396b9c36ab5f20eb0ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18874, "upload_time": "2019-08-24T14:26:34", "url": "https://files.pythonhosted.org/packages/a7/cb/bfce0087fe044838ff1c6d7730fba2865b9fe63a6ca7a507e4787dd262a4/proto_dist_ml-1.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cd3b98b0add015f0520ee1e186f69b40", "sha256": "865e6230ea16be6f817876de4c36181e87f087b68872bca5e7d0c165f2e94f1e" }, "downloads": -1, "filename": "proto_dist_ml-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "cd3b98b0add015f0520ee1e186f69b40", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 30520, "upload_time": "2019-08-24T14:26:32", "url": "https://files.pythonhosted.org/packages/98/a7/1fb559b58caf104dda5ade55500c9e434103278a2f0f8237fd7b18ca59ef/proto_dist_ml-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "815679a50b9ec396b9c36ab5f20eb0ce", "sha256": "323345e263e427307cca22b7e8574c49c05712beeafd78256362bb4efe657c46" }, "downloads": -1, "filename": "proto_dist_ml-1.0.1.tar.gz", "has_sig": false, "md5_digest": "815679a50b9ec396b9c36ab5f20eb0ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18874, "upload_time": "2019-08-24T14:26:34", "url": "https://files.pythonhosted.org/packages/a7/cb/bfce0087fe044838ff1c6d7730fba2865b9fe63a6ca7a507e4787dd262a4/proto_dist_ml-1.0.1.tar.gz" } ] }