{ "info": { "author": "Yusuke Matsubara", "author_email": "whym@whym.org", "bugtrack_url": null, "classifiers": [], "description": "=====================\r\nSCluster\r\n=====================\r\n--------------------------------------------------------\r\nan implementation of spectral clustering for documents\r\n--------------------------------------------------------\r\n\r\n :Homepage: http://github.com/whym/scluster\r\n :Contact: http://whym.org\r\n\r\nOverview\r\n==============================\r\nSpectral clustering is a modern clustering technique considered effective for image clustering, among other applications. [#]_ [#]_\r\n\r\nThis software finds clusters among documents based on the bag-of-words representation [#]_ and TF-IDF weighting [#]_.\r\n\r\n.. [#] Ulrike von Luxburg, A Tutorial on Spectral Clustering, 2006. http://arxiv.org/abs/0711.0189\r\n.. [#] Chris H. Q. Ding, Spectral Clustering, 2004. http://ranger.uta.edu/~chqding/Spectral/\r\n.. [#] http://en.wikipedia.org/wiki/Bag_of_words_model\r\n.. [#] http://en.wikipedia.org/wiki/Tf%E2%80%93idf\r\n\r\nRequirements\r\n==============================\r\nThe following software is required.\r\n\r\n- Python 2 or 3\r\n- NumPy\r\n- SciPy\r\n\r\nHow to use\r\n==============================\r\n1. Prepare documents as raw-text files and put them in a directory, for example, 'reuters'.\r\n2. Prepare a category file. For example, 'cats.txt' may contain: ::\r\n\r\n 14833 palm-oil veg-oil\r\n 14839 ship\r\n\r\n This means that the file '14833' has 'palm-oil' and 'veg-oil' as\r\n its categories, and '14839' has 'ship' as its category.\r\n\r\n3. Run: ``python scluster/clusterer.py cats.txt reuters/ -m kmeans``.\r\n\r\nNotes\r\n==============================\r\n- When you use the Reuters set, note that file 17980 might contain\r\n a non-Unicode character at line 10. It should probably read: \"world\r\n economic growth-side measures ...\"\r\n\r\n.. 
[#] http://www.daviddlewis.com/resources/testcollections/reuters21578/", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/whym/scluster", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "scluster", "package_url": "https://pypi.org/project/scluster/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/scluster/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/whym/scluster" }, "release_url": "https://pypi.org/project/scluster/0.0.2/", "requires_dist": null, "requires_python": null, "summary": "an implementation of spectral clustering for text document collections", "version": "0.0.2" }, "last_serial": 1882320, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c1634c8844d165b9be82ac4ee4022b3a", "sha256": "4a09cc6ad7199d536b05bacf5e0fccccb9b4fa0a71d57f3de438f1933442cbac" }, "downloads": -1, "filename": "scluster-0.0.1.tar.gz", "has_sig": false, "md5_digest": "c1634c8844d165b9be82ac4ee4022b3a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6610, "upload_time": "2015-12-30T13:07:34", "url": "https://files.pythonhosted.org/packages/0a/93/8be1ffc1602bb686428865c28bf3b6f3f4f5a435a8add376d2d2bf8cc8d9/scluster-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "bddeab556f84f542bc6376110a8679b3", "sha256": "18cdb698ccca8c2355b1ef9dbef1340f8ea6003b0cdec845d8f0507cb97b83ad" }, "downloads": -1, "filename": "scluster-0.0.2.tar.gz", "has_sig": false, "md5_digest": "bddeab556f84f542bc6376110a8679b3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6763, "upload_time": "2015-12-30T13:17:31", "url": "https://files.pythonhosted.org/packages/41/86/8cd37687f4f6580707e40ebc5f8722ba517cff4ec1c47f271b03eb047829/scluster-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", 
"digests": { "md5": "bddeab556f84f542bc6376110a8679b3", "sha256": "18cdb698ccca8c2355b1ef9dbef1340f8ea6003b0cdec845d8f0507cb97b83ad" }, "downloads": -1, "filename": "scluster-0.0.2.tar.gz", "has_sig": false, "md5_digest": "bddeab556f84f542bc6376110a8679b3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6763, "upload_time": "2015-12-30T13:17:31", "url": "https://files.pythonhosted.org/packages/41/86/8cd37687f4f6580707e40ebc5f8722ba517cff4ec1c47f271b03eb047829/scluster-0.0.2.tar.gz" } ] }