{ "info": { "author": "Teodor Mihai Moldovan", "author_email": "moldovan@cs.berkeley.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "Topic :: Scientific/Engineering :: Artificial Intelligence" ], "description": "Description\n===========\n\ndpcluster is a package for grouping together (clustering) vectors. It automatically chooses the number of clusters that fits the data best. Specifically, it models the data as a Dirichlet Process mixture in the exponential family. For a tutorial see `\"Dirichlet Process\" by Y.W. Teh (2010) `_. Currently the only distribution implemented is the multivariate Gaussian with a Normal-Inverse-Wishart conjugate prior but extensions to other distributions are possible. \n\nTwo inference algorithms are implemented:\n\n* Variational inference as described in `\"Variational Inference for Dirichlet Process Mixtures\" by Blei et al. (2006) `_. This is a batch algorithm that requires storing all data in memory.\n* An experimental on-line inference algorithm that requires only O(log(n)) memory where n is the total number of observations.\n\nTo install locally run::\n\n python setup.py install --user\n\nUsage\n=====\n\nHere is a simple example to demonstrate clustering a number of random points in the plane::\n\n >>> from dpcluster import *\n >>> n = 10\n >>> data = np.random.normal(size=2*n).reshape(-1,2)\n >>> vdp = VDP(GaussianNIW(2))\n >>> vdp.batch_learn(vdp.distr.sufficient_stats(data))\n >>> plt.scatter(data[:,0],data[:,1])\n >>> vdp.plot_clusters(slc=np.array([0,1]))\n >>> plt.show()\n\nRunning this might produce 2-3 clusters depending on the randomly generated data. The adaptive nature of the Dirichlet Process mixture model becomes apparent when we increase the number of data points from ``n = 10`` to ``n = 500``. In this case the clustering algorithm will likely explain the data using only one cluster.\n\nToDo\n====\n\n* Implement more clustering algorithms e.g. based on Gibbs sampling, expectation propagation, stochastic gradient descent.\n* Implement more clustering distributions.\n* Re-implement algorithms to take advantage of multi-core or GPU computing.", "description_content_type": null, "docs_url": null, "download_url": "http://github.com/teodor-moldovan/dpcluster", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://dpcluster.readthedocs.org/", "keywords": null, "license": "UNKNOWN", "maintainer": null, "maintainer_email": null, "name": "dpcluster", "package_url": "https://pypi.org/project/dpcluster/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/dpcluster/", "project_urls": { "Download": "http://github.com/teodor-moldovan/dpcluster", "Homepage": "http://dpcluster.readthedocs.org/" }, "release_url": "https://pypi.org/project/dpcluster/0.104/", "requires_dist": null, "requires_python": null, "summary": "dpcluster is a package for grouping together (clustering) vectors. It automatically chooses the number of clusters that fits the data best based on the underlying Dirichlet Process mixture model.", "version": "0.104" }, "last_serial": 776479, "releases": { "0.1": [ { "comment_text": "built for Linux-3.8.0-19-generic-x86_64-with-glibc2.7", "digests": { "md5": "74a81630cc94a1573f2c6ac81e8eac43", "sha256": "2f54ea478ffbe76fa39ae3e220ec7139414a00c3757bd0b306690ef1700dba68" }, "downloads": -1, "filename": "dpcluster-0.1.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "74a81630cc94a1573f2c6ac81e8eac43", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 21736, "upload_time": "2013-04-30T20:53:35", "url": "https://files.pythonhosted.org/packages/f1/93/6babad2fc4207fe1b87900b81ff30e361b899452931897f0035922135dee/dpcluster-0.1.linux-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "78a3e8732fc578ea09fb2a2261cb7d24", "sha256": "9f5f6e6f83cc930a33fa8108e74bee5f0960ddeeb32bc2d06256bf434c179934" }, "downloads": -1, "filename": "dpcluster-0.1.tar.gz", "has_sig": false, "md5_digest": "78a3e8732fc578ea09fb2a2261cb7d24", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10216, "upload_time": "2013-04-30T20:54:55", "url": "https://files.pythonhosted.org/packages/f0/bb/dfb7a0dc245dfdd4a392573fbfc855a30405358d856d38214674df3979f1/dpcluster-0.1.tar.gz" } ], "0.101": [ { "comment_text": "built for Linux-3.8.0-19-generic-x86_64-with-glibc2.7", "digests": { "md5": "9018fd4ae8d4328292734248af1ad68f", "sha256": "904363b0453da864b71b8ebe3f4a8018b19737b0d5eade68478222f13e3231df" }, "downloads": -1, "filename": "dpcluster-0.101.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "9018fd4ae8d4328292734248af1ad68f", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 21741, "upload_time": "2013-04-30T20:57:49", "url": "https://files.pythonhosted.org/packages/7c/c4/a7894ac8a2634585adbea7337090f38f7289218744cae852332fe12a596f/dpcluster-0.101.linux-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "171bf661cbf32c643278ad8e5b3cf740", "sha256": "b2a1e99a27a8cb79c982ce5254d440af846ff7e475b63471239468bd35d31fee" }, "downloads": -1, "filename": "dpcluster-0.101.tar.gz", "has_sig": false, "md5_digest": "171bf661cbf32c643278ad8e5b3cf740", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10226, "upload_time": "2013-04-30T20:57:47", "url": "https://files.pythonhosted.org/packages/ac/81/020549a8453207180ce89b4f57cc69729a405d1e0b0bb15d7e331a129d75/dpcluster-0.101.tar.gz" } ], "0.102": [ { "comment_text": "built for Linux-3.8.0-19-generic-x86_64-with-glibc2.7", "digests": { "md5": "7fa18a8a3c6c1bf085686ea4438810a0", "sha256": "18620fa0f8ec32412d16e8fb053d937a41d71a0df34c5e3a587812b7da7b9783" }, "downloads": -1, "filename": "dpcluster-0.102.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "7fa18a8a3c6c1bf085686ea4438810a0", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 21748, "upload_time": "2013-04-30T20:58:58", "url": "https://files.pythonhosted.org/packages/02/51/d56adc44172824d6cc1d1fce8f43856606e4e3514fe6fce37b70153fe9b7/dpcluster-0.102.linux-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "c7eda5908057404542735bf10cdd06d7", "sha256": "2a24ca53bea14d055608bb817c29554f3aa0e8afbfb10a7b7baea2859e4cb8c2" }, "downloads": -1, "filename": "dpcluster-0.102.tar.gz", "has_sig": false, "md5_digest": "c7eda5908057404542735bf10cdd06d7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10233, "upload_time": "2013-04-30T20:58:55", "url": "https://files.pythonhosted.org/packages/1d/dd/1ccd91e3e1344245dd39f46649b5565db6c4143899c7335b7010232a63e8/dpcluster-0.102.tar.gz" } ], "0.103": [ { "comment_text": "built for Linux-3.8.0-19-generic-x86_64-with-glibc2.7", "digests": { "md5": "92a2eb7f8d461769ba11b6a5052a0f33", "sha256": "ae0607fc99f91a230933fcb9ce69d2348e5f9b1fc879b1b7c3c8224f7158f68a" }, "downloads": -1, "filename": "dpcluster-0.103.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "92a2eb7f8d461769ba11b6a5052a0f33", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 21748, "upload_time": "2013-05-02T18:35:11", "url": "https://files.pythonhosted.org/packages/8b/78/5205f7ce53ba0c8749f75809273982fb81421a506e14c1300dc63a6c620a/dpcluster-0.103.linux-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "d167dedfe7e0f30fb93ea0c58f16a66a", "sha256": "9b526debe9c548a5012f2880c5095695789d8e7261bb05fea3dec0aee8c3b202" }, "downloads": -1, "filename": "dpcluster-0.103.tar.gz", "has_sig": false, "md5_digest": "d167dedfe7e0f30fb93ea0c58f16a66a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10236, "upload_time": "2013-05-02T18:35:09", "url": "https://files.pythonhosted.org/packages/21/07/c3ce05cc95322fdc67193011137f123685b7a3ca47aa86d0e5b52fd0bfe5/dpcluster-0.103.tar.gz" } ], "0.104": [ { "comment_text": "built for Linux-3.8.0-25-generic-x86_64-with-glibc2.7", "digests": { "md5": "7f36cf0c54e2cbb72df1977628b42d7a", "sha256": "6da1f8579697e35eea48330d6ad6bf2dfae76c016d02d99ae66d3decccb5720c" }, "downloads": -1, "filename": "dpcluster-0.104.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "7f36cf0c54e2cbb72df1977628b42d7a", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 34712, "upload_time": "2013-06-21T22:10:16", "url": "https://files.pythonhosted.org/packages/e7/e8/9b669738b8d722a98ba090b2dee2c74a0622e92068af835069ef21cb7588/dpcluster-0.104.linux-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "29e2f806300fc7c6d150cb0258a5b487", "sha256": "b00d3e7b804e0ab6b80d59d3e9c895e0a3339b0144c3e9db955e65b24719cec4" }, "downloads": -1, "filename": "dpcluster-0.104.tar.gz", "has_sig": false, "md5_digest": "29e2f806300fc7c6d150cb0258a5b487", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14552, "upload_time": "2013-06-21T22:10:12", "url": "https://files.pythonhosted.org/packages/d6/c1/ef4c1ee0819cf4721adb2bacdfeeca7492e659f3157dfac47721d62a8cc9/dpcluster-0.104.tar.gz" } ] }, "urls": [ { "comment_text": "built for Linux-3.8.0-25-generic-x86_64-with-glibc2.7", "digests": { "md5": "7f36cf0c54e2cbb72df1977628b42d7a", "sha256": "6da1f8579697e35eea48330d6ad6bf2dfae76c016d02d99ae66d3decccb5720c" }, "downloads": -1, "filename": "dpcluster-0.104.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "7f36cf0c54e2cbb72df1977628b42d7a", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 34712, "upload_time": "2013-06-21T22:10:16", "url": "https://files.pythonhosted.org/packages/e7/e8/9b669738b8d722a98ba090b2dee2c74a0622e92068af835069ef21cb7588/dpcluster-0.104.linux-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "29e2f806300fc7c6d150cb0258a5b487", "sha256": "b00d3e7b804e0ab6b80d59d3e9c895e0a3339b0144c3e9db955e65b24719cec4" }, "downloads": -1, "filename": "dpcluster-0.104.tar.gz", "has_sig": false, "md5_digest": "29e2f806300fc7c6d150cb0258a5b487", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14552, "upload_time": "2013-06-21T22:10:12", "url": "https://files.pythonhosted.org/packages/d6/c1/ef4c1ee0819cf4721adb2bacdfeeca7492e659f3157dfac47721d62a8cc9/dpcluster-0.104.tar.gz" } ] }