{ "info": { "author": "Gregory Giecold", "author_email": "ggiecold@jimmy.harvard.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: End Users/Desktop", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: MacOS :: MacOS X", "Operating System :: POSIX", "Programming Language :: C", "Programming Language :: Python :: 2.7", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Mathematics", "Topic :: Scientific/Engineering :: Visualization" ], "description": "Cluster\\_Ensembles\r\n==================\r\n\r\nA package for combining multiple partitions into a consolidated\r\nclustering. The combinatorial optimization problem of obtaining such a\r\nconsensus clustering is reformulated in terms of approximation\r\nalgorithms for graph or hyper-graph partitioning.\r\n\r\nInstallation\r\n------------\r\n\r\nCluster\\_Ensembles is written in Python and in C. You need Python 2.7,\r\nits Standard Library and the following packages:\r\n\r\n- numexpr (version 2.4.0 or later)\r\n\r\n- NumPy (version 1.9.0 or any ulterior version)\r\n\r\n- SciPy\r\n\r\n- scikit-learn\r\n\r\n- setuptools\r\n\r\n- PyTables\r\n\r\nThe ``pip install Cluster_Ensembles`` command mentioned below should automatically detect and, if applicable, install or update any of the afore-mentioned dependencies.\r\n\r\nAs yet another prelimiary to running Cluster\\_Ensembles, you should also\r\nfollow the few more instructions below.\r\n\r\nOn CentOS, Fedora or some Red Hat Linux distribution:\r\n\r\n- open a terminal console;\r\n\r\n- type in: ``sudo dnf install glibc.i686``.\r\n\r\nThis will install the GNU C library that is required to run a 32-bit\r\nexecutable binary with a 64-bit Linux kernel. This executable is tasked\r\nwith hyper-graph partitioning. Skipping this step would result in a\r\n``bad ELF interpreter`` error message when subsequently trying to run\r\nthe Cluster\\_Ensembles package.\r\n\r\nOn a Debian or Ubuntu platform, the following commands should yield the\r\nsame outcome: \r\n\r\n- open a terminal console;\r\n\r\n- type in: ``sudo dpkg --add-architecture i386`` to add support for the \r\n i386 architecture; \r\n\r\n- enter: ``sudo apt-get install libc6:i386``.\r\n\r\nUpon completion of the steps outlined above, install Cluster\\_Ensembles\r\nby sending a request to the Python Package Index (PyPI) as follows: \r\n\r\n- open a terminal console;\r\n\r\n- enter ``pip install Cluster_Ensembles``.\r\n\r\nAny missing third-party dependency should be automatically resolved.\r\nPlease note that as part of the installation of this package, some code\r\nwritten in C that will later on be required by the Cluster\\_Ensembles\r\npackage to determine a graph partition is automatically compiled under\r\nthe hood and according to the specifications of your machine. You\r\ntherefore need to ensure availability of ``CMake`` and ``GNU make`` on\r\nyour operating system.\r\n\r\nUsage\r\n-----\r\n\r\nSay that you have an array of shape (M, N) where each row corresponds to a vector reporting the cluster label of each of the N samples comprising your dataset. It is possible that some of those samples have been left out of consideration from some of those M clusterings; in this case, the corresponding entry is tagged as NaN (``numpy.nan``).\r\n\r\nThe few lines below illustrate how to submit consensus clustering analysis such an cluster_runs (M, N) array of cluster labels. A vector holding the consensus clustering identities for each of the N samples in your dataset, ``consensus_clustering_labels``, is returned.\r\n\r\nPlease note that those M vectors of clustering labels can correspond to partitions of the samples into distinct numbers of overall clusters. Cluster_Ensembles therefore offers the possibility of seeking a consensus clustering from the aggregation of a clustering of your dataset into, say, 10 groups, another clustering of a fraction of your samples into 5 clusters, yet another partition of your dataset into 20 clusters, etc. Those choices are entirely up to you. Pretty much all that is required for Cluster_Ensembles is an array of clustering vectors.\r\n\r\n::\r\n\r\n >>> import numpy as np\r\n >>> import Cluster_Ensembles as CE\r\n >>> cluster_runs = np.random.randint(0, 50, (50, 15000))\r\n >>> consensus_clustering_labels = CE.cluster_ensembles(cluster_runs, verbose = True, N_clusters_max = 50)\r\n\r\nReferences\r\n----------\r\n\r\n- Giecold, G., Marco, E., Trippa, L. and Yuan, G.-C., \r\n \u201cRobust Lineage Reconstruction from High-Dimensional Single-Cell Data\u201d. \r\n ArXiv preprint [q-bio.QM, stat.AP, stat.CO, stat.ML]: http://arxiv.org/abs/1601.02748\r\n\r\n- A. Strehl and J. Ghosh, \"Cluster Ensembles - A Knowledge Reuse\r\n Framework for Combining Multiple Partitions\". In: Journal of Machine\r\n Learning Research, 3, pp. 583-617. 2002", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/GGiecold/Cluster_Ensembles", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/GGiecold/Cluster_Ensembles", "keywords": "aggregation clustering consensus consensus-clustering CSPA data-mining ensemble ensemble-clustering HGPA hyper-graph machine-learning MCLA partition pattern-recognition unsupervised-learning", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "Cluster_Ensembles", "package_url": "https://pypi.org/project/Cluster_Ensembles/", "platform": "", "project_url": "https://pypi.org/project/Cluster_Ensembles/", "project_urls": { "Download": "https://github.com/GGiecold/Cluster_Ensembles", "Homepage": "https://github.com/GGiecold/Cluster_Ensembles" }, "release_url": "https://pypi.org/project/Cluster_Ensembles/1.16/", "requires_dist": null, "requires_python": null, "summary": "A package for determining the consensus clustering from an ensemble of partitions", "version": "1.16" }, "last_serial": 1906994, "releases": { "1.14": [ { "comment_text": "", "digests": { "md5": "6c9dfa826ca06c62a5a74b5b52ce29d0", "sha256": "80428c2add63209660f7824b0a4f036eb22a1c2f158dc95b4597fe5d9372f241" }, "downloads": -1, "filename": "Cluster_Ensembles-1.14.tar.gz", "has_sig": false, "md5_digest": "6c9dfa826ca06c62a5a74b5b52ce29d0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5250612, "upload_time": "2015-12-08T04:37:05", "url": "https://files.pythonhosted.org/packages/9f/41/cc37e1306de75e01a08197d466c000dce7fcde249565169c71255ecd9789/Cluster_Ensembles-1.14.tar.gz" } ], "1.15": [ { "comment_text": "", "digests": { "md5": "ddc47c08dabaee19dde4773fb5b56386", "sha256": "5f240c46805e5f55d89f94a0a5c636c710e03d86d1b5985ef05fe46a29669ef5" }, "downloads": -1, "filename": "Cluster_Ensembles-1.15.tar.gz", "has_sig": false, "md5_digest": "ddc47c08dabaee19dde4773fb5b56386", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5252648, "upload_time": "2015-12-09T18:59:47", "url": "https://files.pythonhosted.org/packages/8f/d1/984bc7459c61213f319eb3b56e44d85fec220db93bc5d7bf1f515a16f34c/Cluster_Ensembles-1.15.tar.gz" } ], "1.16": [ { "comment_text": "", "digests": { "md5": "724dd36fe3b1f85105897605a0f10f98", "sha256": "acc4e45e4778c81c84dafcd6cdea0b33be5e2b90666962926ddb5feff46a7c54" }, "downloads": -1, "filename": "Cluster_Ensembles-1.16.tar.gz", "has_sig": false, "md5_digest": "724dd36fe3b1f85105897605a0f10f98", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5252660, "upload_time": "2015-12-23T20:06:47", "url": "https://files.pythonhosted.org/packages/ba/b4/6449b7ef22b47ff599dbb2c007db8f05e3362964e1a796f408804913a1c6/Cluster_Ensembles-1.16.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "724dd36fe3b1f85105897605a0f10f98", "sha256": "acc4e45e4778c81c84dafcd6cdea0b33be5e2b90666962926ddb5feff46a7c54" }, "downloads": -1, "filename": "Cluster_Ensembles-1.16.tar.gz", "has_sig": false, "md5_digest": "724dd36fe3b1f85105897605a0f10f98", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5252660, "upload_time": "2015-12-23T20:06:47", "url": "https://files.pythonhosted.org/packages/ba/b4/6449b7ef22b47ff599dbb2c007db8f05e3362964e1a796f408804913a1c6/Cluster_Ensembles-1.16.tar.gz" } ] }