{ "info": { "author": "Karolina Sienkiewicz", "author_email": "sienkiewicz2k@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "sumo: subtyping tool for multi-omic data\n========================================\n\n|badge1| |badge2| |badge3| |badge4|\n\n.. |badge1| image:: https://travis-ci.org/ratan-lab/sumo.svg?branch=master\n :target: https://travis-ci.org/ratan-lab/sumo\n.. |badge2| image:: https://img.shields.io/github/license/ratan-lab/sumo\n :alt: GitHub\n.. |badge3| image:: https://readthedocs.org/projects/python-sumo/badge/?version=latest\n :target: https://python-sumo.readthedocs.io/en/latest/?badge=latest\n :alt: Documentation Status\n.. |badge4| image:: https://img.shields.io/pypi/v/python-sumo\n :alt: PyPI\n\n.. inclusion-start-marker-do-not-remove\n\n.. short-description-start-marker-do-not-remove\n\n**sumo** is a command-line tool to identify molecular subtypes in multi-omics datasets. It implements a novel nonnegative matrix factorization (NMF) algorithm to identify groups of samples that share molecular signatures, and provides tools to evaluate such assignments.\n\n.. short-description-end-marker-do-not-remove\n\nInstallation\n------------\nYou can install **sumo** from PyPI by executing the command below. Please note that Python 3.6+ is required.\n\n.. 
code:: sh\n\n pip install python-sumo\n\nDependencies\n------------\n\n- Python 3.6+\n- Python libraries:\n\n - NumPy\n - pandas\n - SciPy\n - scikit-learn\n - Matplotlib\n - Seaborn\n\nOptional requirements\n^^^^^^^^^^^^^^^^^^^^^\n\n- pytest (for running the test suite)\n- Sphinx (for generating documentation)\n\nDocumentation\n-------------\nThe official documentation is available at https://python-sumo.readthedocs.io\n\nLicense\n-------\n\nMIT\n\n\nUsage\n-----\n\nA typical workflow runs *prepare* mode to build similarity\nmatrices from feature matrices, followed by factorization of the resulting multiplex network (mode *run*).\nThe third mode, *evaluate*, compares the resulting cluster labels against biologically significant labels.\n\nprepare\n^^^^^^^\nGenerates similarity matrices for samples based on biological data and saves them into multiplex network files.\n\n::\n\n Usage:\n sumo prepare [-h] [-method {rbf,pearson,spearman}] [-k K]\n [-alpha ALPHA] [-missing MISSING] [-names NAMES] [-sn SN]\n [-fn FN] [-df DF] [-ds DS] [-logfile LOGFILE]\n [-log {DEBUG,INFO,WARNING}] [-plot PLOT]\n infile1,infile2,... var1,var2,... outfile.npz\n\n Positional arguments:\n infile1,infile2,... comma-delimited list of paths to input .npz or .txt\n files (all input files should be structured in\n the following way: consecutive samples in columns,\n consecutive features in rows)\n var1(,var2,...) 
either one variable type for every data matrix in\n input file(s) or comma-delimited list of variable\n types ['continuous', 'binary', 'categorical']\n outfile.npz path to output .npz file\n\n Optional arguments:\n -h, --help show this help message and exit\n -method {rbf,pearson,spearman}\n method of sample-sample similarity calculation\n (default of \"rbf\")\n -k K fraction of nearest neighbours to use for sample\n similarity calculation using RBF method (default of 0.1)\n -alpha ALPHA hyperparameter of RBF similarity kernel (default of 0.5)\n -missing MISSING acceptable fraction of available values for assessment\n of distance/similarity between pairs of samples (default of 0.1)\n -names NAMES optional key of array containing custom sample names\n in every .npz file (if not set, ids of samples are used,\n which can cause problems when layers have missing samples)\n -sn SN index of row with sample names for .txt input files\n (default of 0)\n -fn FN index of column with feature names for .txt input files\n (default of 0)\n -df DF if percentage of missing values for feature exceeds\n this value, remove feature (default of 0.1)\n -ds DS if percentage of missing values for sample (that\n remains after feature dropping) exceeds this value,\n remove sample (default of 0.1)\n -logfile LOGFILE path to save log file, by default stdout is used\n -log {DEBUG,INFO,WARNING}\n Sets the logging level (default of INFO)\n -plot PLOT path to save adjacency matrix heatmap(s),\n by default plots are displayed on screen\n\n**Example**\n\n.. 
code:: sh\n\n sumo prepare -plot plot.png methylation.txt,expression.txt continuous prepared.data.npz\n\nrun\n^^^\nCluster multiplex network using non-negative matrix tri-factorization to identify molecular subtypes.\n\n::\n\n Usage:\n sumo run [-h] [-sparsity SPARSITY] [-n N]\n [-method {max_value,spectral}] [-max_iter MAX_ITER] [-tol TOL]\n [-calc_cost CALC_COST] [-logfile LOGFILE]\n [-log {DEBUG,INFO,WARNING}] [-h_init H_INIT] [-t T]\n infile.npz k outdir\n\n Positional arguments:\n infile.npz input .npz file containing adjacency matrices for\n every network layer and sample names (file created by\n running program with mode \"prepare\") - consecutive\n adjacency arrays in file are indexed in the following way:\n \"0\", \"1\" ... and index of sample name vector is \"samples\"\n k either one value describing number of clusters or\n comma-delimited range of values to check (sumo will\n suggest cluster structure based on cophenetic\n correlation coefficient)\n outdir path to save output files\n\n Optional arguments:\n -h, --help show this help message and exit\n -sparsity SPARSITY either one value or comma-delimited list of sparsity\n penalty values for H matrix (sumo will try different\n values and select the best results; default of\n [0.0001, 0.001, 0.01, 0.1, 1, 10.0, 100.0])\n -n N number of repetitions (default of 50)\n -method {max_value,spectral}\n method of cluster extraction (default of \"max_value\")\n -max_iter MAX_ITER maximum number of iterations for factorization\n (default of 500)\n -tol TOL if objective cost function value fluctuation (|\u0394\u2112|) is\n smaller than this value, stop iterations before\n reaching max_iter (default of 1e-05)\n -calc_cost CALC_COST number of steps between every calculation of objective\n cost function (default of 20)\n -logfile LOGFILE path to save log file (by default printed to stdout)\n -log {DEBUG,INFO,WARNING}\n Set the logging level (default of INFO)\n -h_init H_INIT index of adjacency matrix to use for H matrix\n 
initialization (by default using average adjacency)\n -t T number of threads (default of 1)\n\n**Example**\n\n.. code:: sh\n\n sumo run -t 10 prepared.data.npz 2,5 results_dir\n\nevaluate\n^^^^^^^^\nEvaluate clustering results, given a set of labels.\n\n::\n\n Usage:\n sumo evaluate [-h] [-npz NPZ] [-metric {NMI,purity,ARI}]\n [-logfile LOGFILE]\n infile.npz labels\n\n\n Positional arguments:\n infile.npz input .npz file containing array indexed as\n 'clusters', with sample names in first column and\n clustering labels in second column (file created by\n running sumo with mode 'run')\n labels either .npy file containing array with sample names in\n first column and labels in second column or .npz\n file (requires using the '-npz' option)\n\n Optional arguments:\n -h, --help show this help message and exit\n -npz NPZ key of array containing labels in .npz file\n -metric {NMI,purity,ARI}\n metric for accuracy evaluation (by default all metrics\n are calculated)\n -logfile LOGFILE path to save log file (by default printed to stdout)\n\n**Example**\n\n.. code:: sh\n\n sumo evaluate -npz subtypes results_dir/k3/sumo_results.npz labels.npz\n\n.. inclusion-end-marker-do-not-remove\n\n.. Please refer to documentation for more detailed description of a method,\n.. 
example usage cases and suggestions for data pre-preparation.\n\n\n", "description_content_type": "text/x-rst", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ratan-lab/sumo", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "python-sumo", "package_url": "https://pypi.org/project/python-sumo/", "platform": "", "project_url": "https://pypi.org/project/python-sumo/", "project_urls": { "Homepage": "https://github.com/ratan-lab/sumo" }, "release_url": "https://pypi.org/project/python-sumo/0.1.2/", "requires_dist": [ "matplotlib", "numpy", "pandas", "scikit-learn", "scipy", "seaborn" ], "requires_python": "", "summary": "**sumo** is a command-line tool to identify molecular subtypes in multi-omics datasets. It implements a novel nonnegative matrix factorization (NMF) algorithm to identify groups of samples that share molecular signatures, and provides tools to evaluate such assignments.", "version": "0.1.2" }, "last_serial": 5863802, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "8e0c3f603db201b6a99890e9a4865d38", "sha256": "682ae84896ad88c36f43833b791a80c751d403a10ed279029552bff2779d3af6" }, "downloads": -1, "filename": "python_sumo-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "8e0c3f603db201b6a99890e9a4865d38", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 33329, "upload_time": "2019-09-16T20:21:12", "url": "https://files.pythonhosted.org/packages/49/bd/f7751ad7d6f54b91cc07e513f825b596795beb6503bd83854f301d17afec/python_sumo-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6d0c31ce2e82c167ed9f932dfd73a440", "sha256": "db069b41166a7765879d663e9aa6f3b29bc2e1c404b62cb964f87244d874bba8" }, "downloads": -1, "filename": "python-sumo-0.1.0.tar.gz", "has_sig": false, "md5_digest": "6d0c31ce2e82c167ed9f932dfd73a440", "packagetype": "sdist", "python_version": 
"source", "requires_python": null, "size": 34632, "upload_time": "2019-09-16T20:21:15", "url": "https://files.pythonhosted.org/packages/a3/ac/6cd9e7ec05d7f71d080a8c27c52bec7d05b41796d00f1d64a9997f1e55f3/python-sumo-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "7b408dcc63a346391bf126f3b71608a2", "sha256": "9f0345e32ecabe7ed8b08224f7f9520822790023071cd973bc860063c70a37c2" }, "downloads": -1, "filename": "python_sumo-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "7b408dcc63a346391bf126f3b71608a2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 36305, "upload_time": "2019-09-16T21:56:43", "url": "https://files.pythonhosted.org/packages/ff/38/eb0b613521971a4dd3032a055714bb80d4b6c00ae924b1b6b833fd21c8d1/python_sumo-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "328c9c3bdfd5aa0fc03f36bd6679c8de", "sha256": "d9850858c1aef1f9f84860fe6cb4c21c043e961bd5885e0cce5ca8f99a1bc95d" }, "downloads": -1, "filename": "python-sumo-0.1.1.tar.gz", "has_sig": false, "md5_digest": "328c9c3bdfd5aa0fc03f36bd6679c8de", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 34707, "upload_time": "2019-09-16T21:56:44", "url": "https://files.pythonhosted.org/packages/a9/0a/9a5bae0a03ba6d2272891a5aac80f9d9e977340b05ae2de562581bc497fd/python-sumo-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "eb6820a69af7048a28afc73d19f9c1b9", "sha256": "228bb34c952a0bec588ad5dd62abb323196deab5c80af1cb18afe2ec875dc218" }, "downloads": -1, "filename": "python_sumo-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "eb6820a69af7048a28afc73d19f9c1b9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 33704, "upload_time": "2019-09-20T19:25:14", "url": "https://files.pythonhosted.org/packages/e6/29/d2d80d41a6619d561d927cb8f272153743eb756dbb87d522397a3d0a9c7e/python_sumo-0.1.2-py3-none-any.whl" }, { "comment_text": "", 
"digests": { "md5": "722d587ee7934d5b1ab9e68707db3800", "sha256": "8d2e23f91b56ec717d2ea34a17e72884ec6deece2b949acb2d2c91c8a189d85e" }, "downloads": -1, "filename": "python-sumo-0.1.2.tar.gz", "has_sig": false, "md5_digest": "722d587ee7934d5b1ab9e68707db3800", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 40939, "upload_time": "2019-09-20T19:25:15", "url": "https://files.pythonhosted.org/packages/51/36/736bbebc662915adaf66d7e63743837a6c7c704b67e03b9afce1a5b82cf4/python-sumo-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "eb6820a69af7048a28afc73d19f9c1b9", "sha256": "228bb34c952a0bec588ad5dd62abb323196deab5c80af1cb18afe2ec875dc218" }, "downloads": -1, "filename": "python_sumo-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "eb6820a69af7048a28afc73d19f9c1b9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 33704, "upload_time": "2019-09-20T19:25:14", "url": "https://files.pythonhosted.org/packages/e6/29/d2d80d41a6619d561d927cb8f272153743eb756dbb87d522397a3d0a9c7e/python_sumo-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "722d587ee7934d5b1ab9e68707db3800", "sha256": "8d2e23f91b56ec717d2ea34a17e72884ec6deece2b949acb2d2c91c8a189d85e" }, "downloads": -1, "filename": "python-sumo-0.1.2.tar.gz", "has_sig": false, "md5_digest": "722d587ee7934d5b1ab9e68707db3800", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 40939, "upload_time": "2019-09-20T19:25:15", "url": "https://files.pythonhosted.org/packages/51/36/736bbebc662915adaf66d7e63743837a6c7c704b67e03b9afce1a5b82cf4/python-sumo-0.1.2.tar.gz" } ] }