{ "info": { "author": "Ruggles Lab", "author_email": "ruggleslab@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: MacOS", "Operating System :: Unix", "Programming Language :: Python :: 3.7" ], "description": "# BlackSheep\n##### A tool for differential extreme-value analysis\n\n### Installation\nWith pip\n```bash\npip install blksheep\n```\nWith conda\n```bash\nconda install -c bioconda blksheep\n```\n\n### Requirements (automatically taken care of with pip and conda)\npandas \nnumpy \nmatplotlib \nseaborn \nscipy \nscikit-learn \nstatsmodels \n\n### Documentation\nhttps://blacksheep.readthedocs.io/en/latest/index.html\n\n### Usage\n##### In python\n```python\nimport blacksheep\n\n# Read in data\nvalues_file = '' #insert values file here\nannotations_file = '' #insert annotations file here\nvalues = blacksheep.read_in_values(values_file)\nannotations = blacksheep.read_in_values(annotations_file)\n\n# Binarize annotation columns\nannotations = blacksheep.binarize_annotations(annotations)\n\n# Run outliers comparative analysis\noutliers, qvalues = blacksheep.deva(\n values, annotations,\n save_outlier_table=True,\n save_qvalues=True,\n save_comparison_summaries=True\n)\n\n# Pull out results\nqvalues_table = qvalues.df\nvis_table = outliers.frac_table\n\n# Make heatmaps for significant genes\nfor col in annotations.columns:\n axs = blacksheep.plot_heatmap(annotations, qvalues_table, col, vis_table, savefig=True)\n\n# Normalize values\nphospho = blacksheep.read_in_values('') #Fill in file here\nprotein = blacksheep.read_in_values('') #Fill in file here\n\n\n```\n\n##### Command line interface\n*Example*\n```bash\nblacksheep binarize annotations.tsv --output_prefix annotations_test\nblacksheep deva values.csv annotations_test.binarized.tsv --output_prefix test \\\n--write_outlier_table --write_comparison_summaries --write_gene_list \\\n--make_heatmaps\n```\n\n*Full help* \nJust make the outliers 
table:\n```bash\nusage: blacksheep outliers_table [-h] [--output_prefix OUTPUT_PREFIX] [--iqrs IQRS]\n [--up_or_down {up,down}] [--ind_sep IND_SEP]\n [--do_not_aggregate] [--write_frac_table]\n values\n\nTakes a table of values and converts to a table of outlier counts.\n\npositional arguments:\n values File path to input values. Columns must be samples,\n genes must be sites or genes. Only .tsv and .csv\n accepted.\n\noptional arguments:\n -h, --help show this help message and exit\n --output_prefix OUTPUT_PREFIX\n Output prefix for writing files. Default outliers.\n --iqrs IQRS Number of interquartile ranges (IQRs) above or below\n the median to consider a value an outlier. Default is\n 1.5 IQRs.\n --up_or_down {up,down}\n Whether to look for up or down outliers. Choices are\n up or down. Default up.\n --ind_sep IND_SEP If site labels have a parent molecule (e.g. a gene\n name such as ATM) and a site identifier (e.g. S365)\n this is the delimiter between the two elements.\n Default is -\n --do_not_aggregate Use flag if you do not want to sum outliers based on\n site prefixes.\n --write_frac_table Use flag if you want to write a table with fraction of\n values per site, per sample that are outliers. Will\n not be written by default. Useful for visualization.\n```\n\nBinarize the columns in an annotations table. \n**Warning: do not include non-categorical columns, or columns you don't want binarized. You'll\n end up with a huge un-wieldly table. **\n```bash\nusage: blacksheep binarize [-h] [--output_prefix OUTPUT_PREFIX] annotations\n\nTakes an annotation table where some columns may have more than 2 possible\nvalues (not including empty/null values) and outputs an annotation table with\nonly two values per annotation. 
Propagates null values.\n\npositional arguments:\n annotations Annotation table with samples as rows and annotation\n labels as columns.\n\noptional arguments:\n -h, --help show this help message and exit\n --output_prefix OUTPUT_PREFIX\n Output prefix for writing files. Default outliers.\n```\n\nCompare all the groups described in columns of an annotation table using outlier counts\n```bash\nusage: blacksheep compare_groups [-h] [--output_prefix OUTPUT_PREFIX]\n [--frac_filter FRAC_FILTER]\n [--write_comparison_summaries] [--iqrs IQRS]\n [--up_or_down {up,down}] [--write_gene_list]\n [--make_heatmaps] [--fdr FDR]\n [--red_or_blue {red,blue}]\n [--annotation_colors ANNOTATION_COLORS]\n outliers_table annotations\n\nTakes an annotation table and outlier count table (output of outliers_table)\nand outputs qvalues from a statistical test that looks for enrichment of\noutlier values in each group in the annotation table. For each value in each\ncomparison, the qvalue table will have 1 column, if there are any genes in\nthat comparison.\n\npositional arguments:\n outliers_table Table of outlier counts (output of outliers_table).\n Must be .tsv or .csv file, with outlier and non-\n outlier counts as columns, and genes/sites as rows.\n annotations Table of annotations. Must be .csv or .tsv. Samples as\n rows and comparisons as columns. Comparisons must have\n only unique values (not including missing values). If\n there are more options than that, you can use binarize\n to prepare the table.\n\noptional arguments:\n -h, --help show this help message and exit\n --output_prefix OUTPUT_PREFIX\n Output prefix for writing files. Default outliers.\n --frac_filter FRAC_FILTER\n The minimum fraction of samples per group that must\n have an outlier in a gene toconsider that gene in the\n analysis. 
This is used to prevent a high number of\n outlier values in 1 sample from driving a low qvalue.\n Default 0.3\n --write_comparison_summaries\n Use flag to write a separate file for each column in\n the annotations table, with outlier counts in each\n group, p-values and q-values in each group.\n --iqrs IQRS Number of IQRs used to define outliers in the input\n count table. Optional.\n --up_or_down {up,down}\n Whether input outlier table represents up or down\n outliers. Needed for output file labels. Default up\n --write_gene_list Use flag to write a list of significantly enriched\n genes for each value in each comparison. If used, need\n an fdr threshold as well.\n --make_heatmaps Use flag to draw a heatmap of signficantly enriched\n genes for each value in each comparison. If used, need\n an fdr threshold as well.\n --fdr FDR FDR cut off to use for signficantly enriched gene\n lists and heatmaps. Default 0.05\n --red_or_blue {red,blue}\n If --make_heatmaps is called, color of values to draw\n on heatmap. Default red.\n --annotation_colors ANNOTATION_COLORS\n File with color map to use for annotation header if\n --make_heatmaps is used. Must have a 'value color'\n format for each value in annotations. Any value not\n represented will be assigned a new color.\n```\n\nMake heatmaps visualizing enriched genes for each group in an annotation table\n```bash\nusage: blacksheep visualize [-h] [--output_prefix OUTPUT_PREFIX]\n [--annotations_to_show ANNOTATIONS_TO_SHOW [ANNOTATIONS_TO_SHOW ...]]\n [--fdr FDR] [--red_or_blue {red,blue}]\n [--annotation_colors ANNOTATION_COLORS]\n [--write_gene_list]\n comparison_qvalues annotations visualization_table\n comparison_of_interest\n\nUsed to make custom heatmaps from significant genes.\n\npositional arguments:\n comparison_qvalues Table of qvalues, output from compare_groups. Must be\n .csv or .tsv. 
Has genes/sites as rows and comparison\n values as columns.\n annotations Table of annotations used to generate qvalues.\n visualization_table Values to visualize in heatmap. Samples as columns and\n genes/sites as rows. Using outlier fraction table is\n recommended, but original values can also be used if\n no aggregation was used.\n comparison_of_interest\n Name of column in qvalues table from which to\n visualize significant genes.\n\noptional arguments:\n -h, --help show this help message and exit\n --output_prefix OUTPUT_PREFIX\n Output prefix for writing files. Default outliers.\n --annotations_to_show ANNOTATIONS_TO_SHOW [ANNOTATIONS_TO_SHOW ...]\n Names of columns from the annotation table to show in\n the header of the heatmap. Default is all columns.\n --fdr FDR FDR threshold to use to select genes to visualize.\n Default 0.05\n --red_or_blue {red,blue}\n Color of values to draw on heatmap. Default red.\n --annotation_colors ANNOTATION_COLORS\n File with color map to use for annotation header. Must\n have a line with 'value color' format for each value\n in annotations. Any value not represented will be\n assigned a new color.\n --write_gene_list Use flag to write a list of significantly enriched\n genes for each value in each comparison.\n```\n\nRun the whole pipeline: call outliers, perform comparisons on all groups in an annotation table\n, optionally make heatmaps for each group.\n```bash\nusage: blacksheep deva [-h] [--output_prefix OUTPUT_PREFIX] [--iqrs IQRS]\n [--up_or_down {up,down}] [--do_not_aggregate]\n [--write_outlier_table] [--write_frac_table]\n [--ind_sep IND_SEP] [--frac_filter FRAC_FILTER]\n [--write_comparison_summaries] [--fdr FDR]\n [--write_gene_list] [--make_heatmaps]\n [--red_or_blue {red,blue}]\n [--annotation_colors ANNOTATION_COLORS]\n values annotations\n\nRuns whole outliers pipeline. Has options to output every possible output.\n\npositional arguments:\n values File path to input values. 
Samples are columns and\n genes/sites are rows. Only .tsv and .csv accepted.\n annotations File path to annotation values. Rows are sample names,\n header is different annotations. e.g. mutation status.\n\noptional arguments:\n -h, --help show this help message and exit\n --output_prefix OUTPUT_PREFIX\n Output prefix for writing files. Default outliers.\n --iqrs IQRS Number of inter-quartile ranges (IQRs) above or below\n the median to consider a value an outlier. Default is\n 1.5.\n --up_or_down {up,down}\n Whether to look for up or down outliers. Choices are\n up or down. Default up.\n --do_not_aggregate Use flag if you do not want to sum outliers based on\n site prefixes.\n --write_outlier_table\n Use flag to write a table of outlier counts.\n --write_frac_table Use flag if you want to write a table with fraction of\n values per site per sample that are outliers. Useful\n for custom visualization.\n --ind_sep IND_SEP If site labels have a parent molecule (e.g. a gene\n name such as ATM) and a site identifier (e.g. S365)\n this is the delimiter between the two elements.\n Default is -\n --frac_filter FRAC_FILTER\n The minimum fraction of samples per group that must\n have an outlier in a gene toconsider that gene in the\n analysis. This is used to prevent a high number of\n outlier values in 1 sample from driving a low qvalue.\n Default 0.3\n --write_comparison_summaries\n Use flag to write a separate file for each column in\n the annotations table, with outlier counts in each\n group, p-values and q-values in each group.\n --fdr FDR FDR threshold to use to select genes to visualize.\n Default 0.05\n --write_gene_list Use flag to write a list of significantly enriched\n genes for each value in each comparison.\n --make_heatmaps Use flag to draw a heatmap of significantly enriched\n genes for each value in each comparison. If used, need\n an fdr threshold as well.\n --red_or_blue {red,blue}\n Color of values to draw on heatmap. 
Default red.\n --annotation_colors ANNOTATION_COLORS\n File with color map to use for annotation header. Must\n have a line with 'value color' format for each value\n in annotations. Any value not represented will be\n assigned a new color.\n```\n\nFor finding the value differences that cannot be explained by a different data level. For example\n, this can be used to find out variation due to differential phosphorylation (phospho as\n target_values) not due to protein abundance variation (protein as normalizer_values). \n**Warning: Row IDs between the two tables must match** \n```bash\nusage: blacksheep normalize [-h] [--ind_sep IND_SEP] [--output_prefix OUTPUT_PREFIX]\n target_values normalizer_values\n\nTakes a target table and a normalizer table, and returns a normalized target\ntable. Builds a regularized linear model for each line in the target table\nusing the matching row ID in the normalizer table, and finds the residuals of\nthat model for each value. for example, this could be used to normalize\nphospho-peptide data by protein abundance data; resulting values will reflect\nonly abundance differences due to phosphorylation changes, not peptide\nabundances. Another use could be normalizing RNA by CNA.\n\npositional arguments:\n target_values Table of values to be normalized. Sites/genes as rows,\n samples as columns. Row identifiers must be unique.\n normalizer_values Table of values to use for normalization. Sites/genes\n as rows, samples as columns. Row identifiers must be\n unique, and must match the pre-ind_sep part of the\n target values identifiers.\n\noptional arguments:\n -h, --help show this help message and exit\n --ind_sep IND_SEP Separator used in index if target is site specific.\n Row IDs before ind_sep in the target must match the\n row IDs in normalizer_values. If row IDs already\n match, leave blank.\n --output_prefix OUTPUT_PREFIX\n Prefix for output file. 
Suffix will be\n '.normalized.tsv'\n\n```\n\nFor a more thorough vignette, refer to our [supplementary notebooks](https://github.com/ruggleslab/blacksheep_supp)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ruggleslab/blackSheep/", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "blksheep", "package_url": "https://pypi.org/project/blksheep/", "platform": "", "project_url": "https://pypi.org/project/blksheep/", "project_urls": { "Homepage": "https://github.com/ruggleslab/blackSheep/" }, "release_url": "https://pypi.org/project/blksheep/0.0.7/", "requires_dist": [ "matplotlib (>=3.3.4)", "numpy (>=1.20.1)", "pandas (>=1.2.2)", "scipy (>=1.6.0)", "seaborn (>=0.11.1)", "statsmodels (>=0.12.2)" ], "requires_python": "", "summary": "A package for differential extreme values analysis", "version": "0.0.7", "yanked": false, "yanked_reason": null }, "last_serial": 9443721, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "f6d034cea7b6799f6de994625732b44f", "sha256": "ff4c9827c0f59c64991afb84518311bf32de487c3dfa77254142c4012c4b39b6" }, "downloads": -1, "filename": "blksheep-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "f6d034cea7b6799f6de994625732b44f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 28587, "upload_time": "2019-09-11T19:49:29", "upload_time_iso_8601": "2019-09-11T19:49:29.878814Z", "url": "https://files.pythonhosted.org/packages/61/b9/9438beb074d731c515954fc9541f8c95f139343f6d148c3d66f40054d4ba/blksheep-0.0.2-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "61ddc5a2b4d616e088f75b9a5544b5a8", "sha256": "226c532962d8c4e57f27094d6ab282aeba44dd7b169c8905006070c7b36fa130" }, "downloads": -1, "filename": "blksheep-0.0.2.tar.gz", "has_sig": false, "md5_digest": 
"61ddc5a2b4d616e088f75b9a5544b5a8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26680, "upload_time": "2019-09-11T19:49:32", "upload_time_iso_8601": "2019-09-11T19:49:32.057847Z", "url": "https://files.pythonhosted.org/packages/02/6d/a630c556c97777ac735b9f819acc6d95cb0393b643f3af6e6680b9103980/blksheep-0.0.2.tar.gz", "yanked": false, "yanked_reason": null } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "14a77dd0dc0f034ca1aca27b1bb09af6", "sha256": "937c9f14e06798cee95d8928b196aa4b5eb243e499522ae02c30e5eedaa18a07" }, "downloads": -1, "filename": "blksheep-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "14a77dd0dc0f034ca1aca27b1bb09af6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 28571, "upload_time": "2019-09-18T14:47:26", "upload_time_iso_8601": "2019-09-18T14:47:26.570778Z", "url": "https://files.pythonhosted.org/packages/f9/de/94a1907b71aa38179b0583f3a9947cd9f515eba22ca96c59e6b27a062f8a/blksheep-0.0.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "4fc15a55c7dc6b84eff0d3678f69faab", "sha256": "4f53d2066449bc766c90dd067f0f63ed3f4d1c6da815392c0385c614547f251d" }, "downloads": -1, "filename": "blksheep-0.0.3.tar.gz", "has_sig": false, "md5_digest": "4fc15a55c7dc6b84eff0d3678f69faab", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26683, "upload_time": "2019-09-18T14:47:29", "upload_time_iso_8601": "2019-09-18T14:47:29.367589Z", "url": "https://files.pythonhosted.org/packages/c5/93/d3a74b05aa2d6024c3d5ae2de44f186b830159f293fb78478ba41be622c8/blksheep-0.0.3.tar.gz", "yanked": false, "yanked_reason": null } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "06f0c3f80eb571d547ab44dc8966c3b1", "sha256": "451b7ec62b1c36a62d9ae48fce1086753f4245694654051b358cf3ccc1ab6bf1" }, "downloads": -1, "filename": "blksheep-0.0.5-py3-none-any.whl", "has_sig": false, "md5_digest": 
"06f0c3f80eb571d547ab44dc8966c3b1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 29289, "upload_time": "2019-10-23T15:33:51", "upload_time_iso_8601": "2019-10-23T15:33:51.289541Z", "url": "https://files.pythonhosted.org/packages/fc/17/9b8fe28b66e7d5845203312b52b5fcec14113570e245cb9667b143c3adf7/blksheep-0.0.5-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "70d8802705fa33c4a692a96c827677e6", "sha256": "0212677d5b5dbcbd3da3f755f84fd85a8c1d2bc0c9c28431240b8f65a1c7c5a4" }, "downloads": -1, "filename": "blksheep-0.0.5.tar.gz", "has_sig": false, "md5_digest": "70d8802705fa33c4a692a96c827677e6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27475, "upload_time": "2019-10-23T15:33:52", "upload_time_iso_8601": "2019-10-23T15:33:52.650834Z", "url": "https://files.pythonhosted.org/packages/9c/f8/10cebeaddee045fbdacda0b1213284f886a8fe9c4534db0bff30379f2a08/blksheep-0.0.5.tar.gz", "yanked": false, "yanked_reason": null } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "69373d3e6d4a5ea0fbaa07eea520d29b", "sha256": "62dd9396d205433f13afb53d16b2baa1206e718ef88e14688384093770748764" }, "downloads": -1, "filename": "blksheep-0.0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "69373d3e6d4a5ea0fbaa07eea520d29b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 30209, "upload_time": "2019-10-24T22:23:50", "upload_time_iso_8601": "2019-10-24T22:23:50.043720Z", "url": "https://files.pythonhosted.org/packages/23/ca/931d7e6249289e5473f6831e37f7a9ff39fca239739484c4d08e61680a6d/blksheep-0.0.6-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "7496801e94c9777a6367767b4e5ae8e9", "sha256": "a7c2abc9d7eec5d13c8c44075a6f30ff1ba3b77191c19748d7e14e28718d1469" }, "downloads": -1, "filename": "blksheep-0.0.6.tar.gz", "has_sig": false, "md5_digest": 
"7496801e94c9777a6367767b4e5ae8e9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 28356, "upload_time": "2019-10-24T22:23:51", "upload_time_iso_8601": "2019-10-24T22:23:51.415068Z", "url": "https://files.pythonhosted.org/packages/f5/00/98d684de1751356d899c0493af1b5f5a73c7917e2b50a0e110f2719764f6/blksheep-0.0.6.tar.gz", "yanked": false, "yanked_reason": null } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "0292909c88ec5d97953486c8aadfe671", "sha256": "10612b418b20dd89491f22798adea253968060e3b820cb6bfdbb131abf9ad3fa" }, "downloads": -1, "filename": "blksheep-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "0292909c88ec5d97953486c8aadfe671", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 30204, "upload_time": "2021-02-17T13:53:11", "upload_time_iso_8601": "2021-02-17T13:53:11.499901Z", "url": "https://files.pythonhosted.org/packages/c2/fc/e39bfd4af4ba918bbfceace29ec293aba523c1739b129606fd18e882089d/blksheep-0.0.7-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "062e840d6d9fb7ee3cd59565fd1ca36b", "sha256": "31ec77fab04a9ded792acb7acafd96190fa1c8c712b8c5a7519c39bf87f8c6ae" }, "downloads": -1, "filename": "blksheep-0.0.7.tar.gz", "has_sig": false, "md5_digest": "062e840d6d9fb7ee3cd59565fd1ca36b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 29249, "upload_time": "2021-02-17T13:53:12", "upload_time_iso_8601": "2021-02-17T13:53:12.977701Z", "url": "https://files.pythonhosted.org/packages/35/0d/28c5ce8a3a6f7cf4b4780b1a95d8848b094230da76095193a2ca71082c13/blksheep-0.0.7.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "0292909c88ec5d97953486c8aadfe671", "sha256": "10612b418b20dd89491f22798adea253968060e3b820cb6bfdbb131abf9ad3fa" }, "downloads": -1, "filename": "blksheep-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": 
"0292909c88ec5d97953486c8aadfe671", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 30204, "upload_time": "2021-02-17T13:53:11", "upload_time_iso_8601": "2021-02-17T13:53:11.499901Z", "url": "https://files.pythonhosted.org/packages/c2/fc/e39bfd4af4ba918bbfceace29ec293aba523c1739b129606fd18e882089d/blksheep-0.0.7-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "062e840d6d9fb7ee3cd59565fd1ca36b", "sha256": "31ec77fab04a9ded792acb7acafd96190fa1c8c712b8c5a7519c39bf87f8c6ae" }, "downloads": -1, "filename": "blksheep-0.0.7.tar.gz", "has_sig": false, "md5_digest": "062e840d6d9fb7ee3cd59565fd1ca36b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 29249, "upload_time": "2021-02-17T13:53:12", "upload_time_iso_8601": "2021-02-17T13:53:12.977701Z", "url": "https://files.pythonhosted.org/packages/35/0d/28c5ce8a3a6f7cf4b4780b1a95d8848b094230da76095193a2ca71082c13/blksheep-0.0.7.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }