{ "info": { "author": "snfpy developers", "author_email": "rossmarkello@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6" ], "description": "# SNFpy\n\nThis package provides a Python implementation of similarity network fusion (SNF), a technique for combining multiple data sources into a single graph representing sample relationships.\n\n[![Build Status](https://travis-ci.org/rmarkello/snfpy.svg?branch=master)](https://travis-ci.org/rmarkello/snfpy)\n[![Codecov](https://codecov.io/gh/rmarkello/snfpy/branch/master/graph/badge.svg)](https://codecov.io/gh/rmarkello/snfpy)\n[![Documentation Status](https://readthedocs.org/projects/snfpy/badge/?version=latest)](https://snfpy.readthedocs.io/en/latest/?badge=latest)\n[![License](https://img.shields.io/badge/License-LGPL%203.0-blue.svg)](https://opensource.org/licenses/LGPL-3.0)\n\n## Table of Contents\n\nIf you know where you're going, feel free to jump ahead:\n\n* [Installation](#requirements-and-installation)\n* [Purpose](#purpose)\n* [Example usage](#usage)\n* [How to get involved](#how-to-get-involved)\n* [Acknowlegments](#acknowledgments)\n* [License information](#license-information)\n\n## Requirements and installation\n\nThis package requires Python version 3.5 or greater.\nAssuming you have the correct version of Python, you can install this package by opening a command terminal and running the following:\n\n```bash\ngit clone https://github.com/rmarkello/snfpy.git\ncd snfpy\npython setup.py install\n```\n\nYou can install the latest release from PyPi, with:\n\n```bash\npip install snfpy\n```\n\n## Purpose\n\nSimilarity network fusion is a technique originally proposed by [Wang et al., 2014, Nature Methods](https://www.ncbi.nlm.nih.gov/pubmed/24464287) to combine data from different sources for a shared group of samples.\nThe procedure works by constructing networks of these samples for each data source that represent how *similar* each sample is to all the others, and then fusing the networks together.\nThis figure from the original paper, which applied the method to genetics data, provides a nice demonstration:\n\n![Similarity network fusion](https://media.nature.com/lw926/nature-assets/nmeth/journal/v11/n3/images/nmeth.2810-F1.jpg)\n\nThe similarity network generation and fusion process use a [K-nearest neighbors](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) procedure to down-weight weaker relationships between samples.\nHowever, weak relationships that are consistent across data sources are retained via the fusion process.\nFor more information on the math behind SNF you can read the [original paper](https://www.ncbi.nlm.nih.gov/pubmed/24464287) or take a look at the [reference documentation](https://snfpy.readthedocs.io/en/latest/api.html).\n\n## Usage\n\nThere are ~three functions from which you can construct _most_ SNF workflows.\nTo demonstrate, we can use one of the example datasets provided with the `snfpy` distribution:\n\n```python\n>>> from snf import datasets\n>>> digits = datasets.load_digits()\n>>> digits.keys()\ndict_keys(['data', 'labels'])\n```\n\nThe `digits` dataset is comprised of 600 samples profiled across four data types which have 76, 240, 216, and 47 features each; these data can be accessed via `digits.data`:\n\n```python\n>>> for arr in digits.data:\n... print(arr.shape)\n(600, 76)\n(600, 240)\n(600, 216)\n(600, 47)\n```\n\nThe dataset also comes with a set of labels grouping the samples into one of three categories (0, 1, or 2); these labels can be accessed via `digits.labels`.\nWe can see that there are 200 samples per group:\n\n```python\nimport numpy as np\n>>> groups, samples = np.unique(digits.labels, return_counts=True)\n>>> for grp, count in zip(groups, samples):\n... print('Group {:.0f}: {} samples'.format(grp, count))\nGroup 0: 200 samples\nGroup 1: 200 samples\nGroup 2: 200 samples\n```\n\nThe first step in SNF is converting these data arrays into similarity (or \"affinity\") networks.\nTo do so, we provide the data arrays to the `snf.make_affinity()` function:\n\n```python\n>>> import snf\n>>> affinity_networks = snf.make_affinity(digits.data, metric='euclidean', K=20, mu=0.5)\n```\n\nThe arguments provided to this function (`metric`, `K`, and `mu`) all play a role in how the similarity network is constructed.\nThe `metric` argument controls the distance function used when constructing the network.\nDistance networks and similarity networks are inverses of one another, however, so we convert from one to the another.\nTo do so, we apply a `mu`-weighted `K`-nearest neighbors kernel to the distance array (for more math, see the [`snf.make_affinity()` docstring](https://snfpy.readthedocs.io/en/latest/generated/snf.compute.make_affinity.html)).\n\nOnce we have our similarity networks we can fuse them together with SNF:\n\n```python\n>>> fused_network = snf.snf(affinity_networks, K=20)\n```\n\nThe fused network is then ready for other analyses (like clustering!).\nIf we want to cluster the network we can try and determine the \"optimal\" number of clusters to form using `snf.get_n_clusters()`.\nThis functions returns the top _two_ \"optimal\" number of clusters (estimated via an eigengap approach):\n\n```python\n>>> best, second = snf.get_n_clusters(fused_network)\n>>> best, second\n(3, 4)\n```\n\nThen, we can cluster the network using, for example, spectral clustering and compare the resulting labels to the \"true\" labels provided with the test dataset:\n\n```python\n>>> from sklearn.cluster import spectral_clustering\n>>> from sklearn.metrics import v_measure_score\n\n>>> labels = spectral_clustering(fused_network, n_clusters=best)\n>>> v_measure_score(labels, digits.labels)\n0.9734300455833589\n```\n\nThe metric (`v_measure_score`) used here ranges from 0 to 1, where 1 indicates perfect overlap between the derived and true labels and 0 indicates no overlap.\nWe see that 0.97 indicates that the SNF fused network clustering results were highly accurate!\n\nOf course, your mileage may (will) vary for other datasets, but this should be sufficient to get you started!\nFor more detailed examples and alternative uses check out our [documentation](https://snfpy.readthedocs.io).\n\n## How to get involved\n\nWe're thrilled to welcome new contributors!\nIf you're interesting in getting involved, you should start by reading our [contributing guidelines](https://github.com/rmarkello/snfpy/blob/master/CONTRIBUTING.md) and [code of conduct](https://github.com/rmarkello/snfpy/blob/master/CODE_OF_CONDUCT.md).\nOnce you're done with that, take a look at our [issues](https://github.com/rmarkello/snfpy/issues) to see if there is anything you might like to work on.\nAlternatively, if you've found a bug, are experiencing a problem, or have a question, create a new issue with some information about it!\n\n## Acknowledgments\n\nThis code is a translation of the original similarity network fusion code implemented in [R](http://compbio.cs.toronto.edu/SNF/SNF/Software.html) and [Matlab](http://compbio.cs.toronto.edu/SNF/SNF/Software.html).\nAs such, if you use this code please (1) provide a link back to the `snfpy` GitHub repository with the version of the code used, and (2) cite the original similarity network fusion paper:\n\n Wang, B., Mezlini, A. M., Demir, F., Fiume, M., Tu, Z., Brudno, M., Haibe-Kains, B., &\n Goldenberg, A. (2014). Similarity network fusion for aggregating data types on a genomic scale.\n Nature Methods, 11(3), 333.\n\n## License information\n\nThis code is partially translated from the [original SNF code](http://compbio.cs.toronto.edu/SNF/SNF/Software.html) and is therefore released as a derivative under the [LGPLv3](https://github.com/rmarkello/snfpy/blob/master/LICENSE).\nThe full license may be found in the [LICENSE](https://github.com/rmarkello/snfpy/blob/master/LICENSE) file in the `snfpy` distribution.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/rmarkello/snfpy/archive/0.2.1.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/rmarkello/snfpy", "keywords": "", "license": "LGPGv3", "maintainer": "Ross Markello", "maintainer_email": "rossmarkello@gmail.com", "name": "snfpy", "package_url": "https://pypi.org/project/snfpy/", "platform": "", "project_url": "https://pypi.org/project/snfpy/", "project_urls": { "Download": "https://github.com/rmarkello/snfpy/archive/0.2.1.tar.gz", "Homepage": "http://github.com/rmarkello/snfpy" }, "release_url": "https://pypi.org/project/snfpy/0.2.1/", "requires_dist": [ "numpy (>=1.14)", "scikit-learn", "scipy", "pytest-cov; extra == 'all'", "sphinx (>=1.2); extra == 'all'", "pytest (>=3.6); extra == 'all'", "sphinx-rtd-theme; extra == 'all'", "sphinx (>=1.2); extra == 'doc'", "sphinx-rtd-theme; extra == 'doc'", "pytest (>=3.6); extra == 'tests'", "pytest-cov; extra == 'tests'" ], "requires_python": "", "summary": "snfpy is a Python toolbox for performing similarity network fusion.", "version": "0.2.1" }, "last_serial": 4846443, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "ebbbdd9da4ed8799814984eff64e1a78", "sha256": "278aa7acfb7f607af42343032e952314fecf2c5db710ad075351ee8488ceb305" }, "downloads": -1, "filename": "snfpy-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "ebbbdd9da4ed8799814984eff64e1a78", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 21522, "upload_time": "2018-05-14T21:15:20", "url": "https://files.pythonhosted.org/packages/eb/49/ef199c4ac451edb40d51a3c2be7a68108258546577df65b78ea390db3d3b/snfpy-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ca3129ff36b730f79e396d14bd8f756c", "sha256": "7395a0204107b5d6110e60a01dfc601750f220315407e4c2bfdc1e74ee4edff5" }, "downloads": -1, "filename": "snfpy-0.1.tar.gz", "has_sig": false, "md5_digest": "ca3129ff36b730f79e396d14bd8f756c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16459, "upload_time": "2018-05-14T21:15:21", "url": "https://files.pythonhosted.org/packages/a4/11/4a55b9fa4dc376236ca16b4bbc56765f285ff407ee1ae9a89150dacfeeff/snfpy-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "34a0c57efc8e0a786b75ece18d912e7a", "sha256": "0da182bbca6adae264c5cc70a64e8e1fdf93909c4d477833f364932c06a32077" }, "downloads": -1, "filename": "snfpy-0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "34a0c57efc8e0a786b75ece18d912e7a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 26215, "upload_time": "2019-02-18T21:52:30", "url": "https://files.pythonhosted.org/packages/e2/0f/7c6f32b5864651dd741e424b42a60f31c32b54b3c8e64be0d22ccbced62e/snfpy-0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4d57402bfee86e3b634573b174c58061", "sha256": "dd795040d40f595316512e40dee3a0dd8e274be340179dabeaa0803f2a67e85f" }, "downloads": -1, "filename": "snfpy-0.2.tar.gz", "has_sig": false, "md5_digest": "4d57402bfee86e3b634573b174c58061", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54710, "upload_time": "2019-02-18T21:52:32", "url": "https://files.pythonhosted.org/packages/1b/79/4a56cc2543d5077114b90ed045efb617c62f2169474b48aae1fba525341c/snfpy-0.2.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "da52490ad230cf2ce116bb913fe9fdb8", "sha256": "0e0a6357fed032c2f48a6820e2e6882449cb2c365080129d1bee4c36388be7df" }, "downloads": -1, "filename": "snfpy-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "da52490ad230cf2ce116bb913fe9fdb8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 536093, "upload_time": "2019-02-20T16:26:39", "url": "https://files.pythonhosted.org/packages/c7/b7/bf6acedc2f461006e5587585920c4d82e6fbec6f6eb36864bccbd25094ca/snfpy-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3a19bcfd095d642834c94df5f8ae5bf6", "sha256": "44fb0789496f8d7f65e95ec2b04baaa08be24b7dbe57f788b7bdddcb60b3ecbd" }, "downloads": -1, "filename": "snfpy-0.2.1.tar.gz", "has_sig": false, "md5_digest": "3a19bcfd095d642834c94df5f8ae5bf6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 546249, "upload_time": "2019-02-20T16:26:42", "url": "https://files.pythonhosted.org/packages/a6/60/29d5a07b61bc2f521bcbe514937b348ea15837f458d1e2351a3cb8ac3b73/snfpy-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "da52490ad230cf2ce116bb913fe9fdb8", "sha256": "0e0a6357fed032c2f48a6820e2e6882449cb2c365080129d1bee4c36388be7df" }, "downloads": -1, "filename": "snfpy-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "da52490ad230cf2ce116bb913fe9fdb8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 536093, "upload_time": "2019-02-20T16:26:39", "url": "https://files.pythonhosted.org/packages/c7/b7/bf6acedc2f461006e5587585920c4d82e6fbec6f6eb36864bccbd25094ca/snfpy-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3a19bcfd095d642834c94df5f8ae5bf6", "sha256": "44fb0789496f8d7f65e95ec2b04baaa08be24b7dbe57f788b7bdddcb60b3ecbd" }, "downloads": -1, "filename": "snfpy-0.2.1.tar.gz", "has_sig": false, "md5_digest": "3a19bcfd095d642834c94df5f8ae5bf6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 546249, "upload_time": "2019-02-20T16:26:42", "url": "https://files.pythonhosted.org/packages/a6/60/29d5a07b61bc2f521bcbe514937b348ea15837f458d1e2351a3cb8ac3b73/snfpy-0.2.1.tar.gz" } ] }