{ "info": { "author": "Chichau Miau", "author_email": "zmiao@ebi.ac.uk", "bugtrack_url": null, "classifiers": [], "description": "[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/sccaf/README.html)\n\n\n# SCCAF: Single Cell Clustering Assessment Framework\n\nSingle Cell Clustering Assessment Framework (SCCAF) is a novel method for automated identification of putative cell types from single cell RNA-seq (scRNA-seq) data. By iteratively applying clustering and a machine learning approach to gene expression profiles of a given set of cells, SCCAF simultaneously identifies distinct cell groups and a weighted list of feature genes for each group. The feature genes, which are overexpressed in the particular cell group, jointly discriminate the given cell group from other cells. Each such group of cells corresponds to a putative cell type or state, characterised by the feature genes as markers.\n\n# Requirements\n\nThis package's requirements vary depending on how you want to install it (the four options below are independent; you only need the requirements for the one you choose):\n\n- pip: if installation goes through pip, you will need Python3 and pip3 installed.\n- Bioconda: if installation goes through Bioconda, you will need [conda installed and configured to use bioconda channels](https://bioconda.github.io/user/index.html).\n- Docker container: to use SCCAF from its docker container you will need [Docker](https://docs.docker.com/install/) installed.\n- Source code: to use and install from the source code directly, you will need git, Python3 and pip.\n\nThe tool depends on other Python/conda packages, but these are automatically resolved by the different installation methods.\n\nThe tool has been tested with the following versions:\n- conda: versions 4.7.5 and 4.7.10, but it should work with most other versions.\n- Docker: version 18.09.2, but it should work with most other 
versions.\n- Python: versions 3.6.5 and 3.7. We don't expect this to work with Python 2.x.\n- Pip3: version 9.0.3, but any version of pip3 should work.\n\nThis software doesn't require any non-standard hardware.\n\n# Installation\n\n## pip\n\nYou can install SCCAF with pip:\n\n```\npip install sccaf\n```\n\nInstallation time on a laptop with 16 GB of RAM and an academic (LAN) internet connection: <10 minutes.\n\n## Bioconda\n\nYou can install SCCAF with bioconda (please set up conda and the bioconda channel first if you haven't already, as explained [here](https://bioconda.github.io/user/index.html)):\n\n```\nconda install sccaf\n```\n\nInstallation time on a laptop with 16 GB of RAM and an academic (LAN) internet connection: <5 minutes.\n\n## Available as a container\n\nYou can use the SCCAF tool already set up in a Docker container. You need to choose from the available tags [here](https://quay.io/repository/biocontainers/sccaf?tab=tags) and replace it in the call below where it says ``.\n\n```\ndocker pull quay.io/biocontainers/sccaf:\n```\n\n**Note:** BioContainers images do not have a `latest` tag, so a docker pull/run without an explicit tag will fail. For instance, a valid call would be (for version 0.0.7):\n\n```\ndocker run -it quay.io/biocontainers/sccaf:0.0.7--py_0\n```\n\nInside the container, you can either use the Python interactive shell or the command line version (see below).\n\nInstallation (pull) time on a laptop with 16 GB of RAM and an academic (LAN) internet connection: ~10 minutes.\n\n## Use latest source code\n\nAlternatively, for the latest version, clone this repo and go into its directory, then execute `pip3 install .`:\n\n```\ngit clone https://github.com/SCCAF/sccaf\ncd sccaf\n# you might want to create a virtualenv for SCCAF before installing\npip3 install .\n```\n\nIf your Python environment is configured for Python 3, then you should be able to use just python in place of python3 (although pip3 needs to be kept). 
In time this will be simplified by a simple pip call.\n\nInstallation (pull) time on a laptop with 16 GB of RAM and an academic (LAN) internet connection: ~10 minutes.\n\n# Usage within Python environment\n\n## Use with pre-clustered `anndata` object in the [SCANPY](https://scanpy.readthedocs.io/en/stable/) package\n\nThe main method of SCCAF can be applied directly to an [anndata](https://anndata.readthedocs.io/en/stable/) (AnnData is the main data format used by [Scanpy](https://scanpy.readthedocs.io/en/stable/)) object in Python.\n\n**Before applying SCCAF, please make sure the doublets have been excluded and the batch effect has been effectively regressed.**\n\n## Assessment of the quality of a clustering\n\nGiven a clustering stored in an anndata object `adata` under the key `louvain`, we would like to understand the quality (discrimination between clusters) with SCCAF:\n\n```python\nfrom SCCAF import SCCAF_assessment, plot_roc\nimport scanpy as sc\n\nadata = sc.read(\"path-to-clusterised-and-umapped-anndata-file\")\ny_prob, y_pred, y_test, clf, cvsm, acc = SCCAF_assessment(adata.X, adata.obs['louvain'], n=100)\n```\n\nThe returned accuracy is in the `acc` variable.\n\nThe ROC curve can be plotted:\n\n```python\nimport matplotlib.pyplot as plt\n\nplot_roc(y_prob, y_test, clf, cvsm=cvsm, acc=acc)\nplt.show()\n```\n\nHigher accuracy indicates better discrimination, and the ROC curve shows which clusters are problematic.\n\n## Optimize an over-clustering\n\nGiven an over-clustered result, SCCAF optimizes the clustering by merging the cell clusters that cannot be discriminated by machine learning.\n\n### Selecting the starting clustering\n\nThe selection of the starting clustering (or pre-clustering, which is an over-clustering) aims to find a clustering with only over-clustering but no under-clustering. To achieve this clustering, we suggest combining a well-established clustering method (e.g., louvain clustering in SCANPY or K-means or SC3) with data visualization (tSNE). 
We can assume that all the discriminative cell clusters should be detectable in the tSNE plot. Then, we can find a clustering (e.g., louvain with a chosen resolution, 1.5 in the example case) that separates all the \"cell islands\" in the tSNE plot. To achieve a higher speed, we also suggest keeping the number of cell clusters as small as possible. For example, if neither resolution 1.5 nor resolution 2.0 includes under-clustering, we suggest using the resolution 1.5 result as the starting clustering.\n\n```python\nimport scanpy as sc\nfrom SCCAF import SCCAF_optimize_all\n\n# The batch effect MUST be regressed before applying SCCAF\nadata = sc.read(\"path-to-clusterised-and-umapped-anndata-file\")\n\n# The initial over-clustering must be stored under a key consistent with the optimization prefix.\n# E.g., if the optimization prefix is `L2`, the starting point of the optimization is `'%s_Round0' % prefix`, which is `L2_Round0`.\n\nsc.tl.louvain(adata, resolution=1.5, key_added='L2_Round0')\n# Here we aim to achieve an accuracy >90% for the whole dataset, optimizing based on the PCA space:\nSCCAF_optimize_all(ad=adata, plot=False, min_acc=0.9, prefix = 'L2', use='pca')\n```\n\nIn the above run, all changes will be left on the `adata` anndata object and no plots\nwill be generated. 
If you want to see the plots (blocking the progress until you close them)\nthen remove the `plot=False`.\n\n\nWithin the anndata object, assignments of cells to clusters will be left in `adata.obs['_Round']`.\n\n# Notebook demo\n\nYou can find some demonstrative Jupyter Notebooks [here](https://github.com/SCCAF/sccaf/blob/develop/notebook/):\n\n- [Zeisel Mouse Cortex Demo](https://github.com/SCCAF/sccaf/blob/develop/notebook/Zeisel_Mouse_Cortex_Demo.ipynb)\n - Expected execution time on a 16 GB RAM standard laptop: ~15 minutes\n\n# Usage from the command line\n\nWe have added convenience commands for use from the command line in the shell.\nThis also facilitates inclusion in workflow systems.\n\n## Optimisation and general purpose usage\n\nGiven an AnnData dataset with louvain clustering pre-calculated (and batch corrected if needed):\n\n```bash\nsccaf -i --optimise --skip-assessment -s louvain -a 0.89 -c 8 --produce-rounds-summary\n```\n\nThis will leave the result in a new file named `output.h5`, which can be changed via `-o`. In the current setting this will\nalso produce a file named `rounds.txt` with the names of all optimisation rounds left in the output. This file\nis used for later parallelisation (among different machines) of an assessment process to determine the step to choose\nas final clustering.\n\nTo understand all options, simply execute `sccaf --help`.\n\n## Parallel run of assessments\n\nOnce the optimisation has taken place, a strategy for choosing the round to use as the final result is to observe the\ndistribution of accuracies for each round over multiple iterations of the assessment process. How the process is distributed is\na matter of implementation of the local HPC or cloud system. 
Essentially, the process that can be repeated for each round\nis:\n\n```\nround=\nsccaf-assess -i output.h5 -o results/sccaf_assess_$round.txt --slot-for-existing-clustering $round --iterations 20 --cores 8\n```\n\nRunning the above for a number of different rounds will leave files in the `results` folder.\n\n### Merging parallel runs to produce plot\n\nOnce all assessment runs are done, the merging and plotting step can be run:\n\n```\nsccaf-assess-merger -i results -r rounds.txt -o rounds-acc-comparison-plot.png\n```\n\nThis will produce a result like this:\n![plot](https://user-images.githubusercontent.com/368478/66618625-a6c2fe80-ebd1-11e9-8355-ea762097c604.png)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "SCCAF", "package_url": "https://pypi.org/project/SCCAF/", "platform": "", "project_url": "https://pypi.org/project/SCCAF/", "project_urls": null, "release_url": "https://pypi.org/project/SCCAF/0.0.7.post1/", "requires_dist": [ "numpy", "pandas", "louvain", "scikit-learn", "psutil", "scanpy (==1.4.4)" ], "requires_python": "", "summary": "Single-Cell Clustering Assessment Framework", "version": "0.0.7.post1" }, "last_serial": 5995015, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "1e7d078b212f9c54401ae553170cca67", "sha256": "9ae49f6c632a23761a5025225b42f93fa1303ae68c2c5b5d52292b190307eb14" }, "downloads": -1, "filename": "SCCAF-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1e7d078b212f9c54401ae553170cca67", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19391, "upload_time": "2019-07-12T15:31:52", "url": "https://files.pythonhosted.org/packages/25/0b/f797b330eb4b7ab8f956f86dba6c1078bc0369aed2c62d7e5fae8e0d9c92/SCCAF-0.0.2-py3-none-any.whl" }, { "comment_text": "", 
"digests": { "md5": "106b3674e07712197d2f1ed5f4929a92", "sha256": "48074f17f3d481929a4eb593873c9874df673fe6ad0bcc182674a380e9bdb566" }, "downloads": -1, "filename": "SCCAF-0.0.2.tar.gz", "has_sig": false, "md5_digest": "106b3674e07712197d2f1ed5f4929a92", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20060, "upload_time": "2019-07-12T15:31:55", "url": "https://files.pythonhosted.org/packages/1c/46/2acbc0fe4533ad06186d8e5a24dbdb157cbbd50a13d9ee6a5dcb8ebb5be8/SCCAF-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "766c0dc70758b04db954599c731e88f8", "sha256": "ec78627cdb007938cf0f161f2d072ec675f11354f01cc47e540bfc907f20ae89" }, "downloads": -1, "filename": "SCCAF-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "766c0dc70758b04db954599c731e88f8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25613, "upload_time": "2019-08-09T17:07:31", "url": "https://files.pythonhosted.org/packages/9e/61/142b775d91766bdf2fddba5680bbf5a294b41fcce434b3afbd54c8edc39b/SCCAF-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1abef6f541fe04d326a82830ba530139", "sha256": "6d45dfba6ba611869f08c9448cbc5916bca63c0dba570d5ac04ec9868c46c5f5" }, "downloads": -1, "filename": "SCCAF-0.0.3.tar.gz", "has_sig": false, "md5_digest": "1abef6f541fe04d326a82830ba530139", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25356, "upload_time": "2019-08-09T17:07:33", "url": "https://files.pythonhosted.org/packages/7a/34/13491708d44f4f7ff59e72c4c05039098c9e1512bf3d11e5d739e4dc65cf/SCCAF-0.0.3.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "4b9163e1a7e9f453dae94194f9eb90c3", "sha256": "3998d0df3e7677700888d0cbab45b5fbdbe64fd00da923c705e63983ac249ba2" }, "downloads": -1, "filename": "SCCAF-0.0.5-py3-none-any.whl", "has_sig": false, "md5_digest": "4b9163e1a7e9f453dae94194f9eb90c3", "packagetype": "bdist_wheel", "python_version": 
"py3", "requires_python": null, "size": 24118, "upload_time": "2019-08-14T23:33:10", "url": "https://files.pythonhosted.org/packages/8d/df/a10249fa38f4ebe02ae0d7051d4b7bbfb63a4e76360ba735d78ff5a5b3f8/SCCAF-0.0.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a191d1f023f5fa9bf8084944ce73802", "sha256": "3ec634408f0a6793714073cdf1bb676b2bcfcc9a44c60c14d7c70dfd0320aeeb" }, "downloads": -1, "filename": "SCCAF-0.0.5.tar.gz", "has_sig": false, "md5_digest": "1a191d1f023f5fa9bf8084944ce73802", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25104, "upload_time": "2019-08-14T23:33:13", "url": "https://files.pythonhosted.org/packages/80/c2/eefd73c65f0dccfdc41932b7b00c10fac7b13d7c3516552e8bee59c0cdd8/SCCAF-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "24c71b26119dd90c7492a09ba5af3d2b", "sha256": "dd5df0c6f1c523b1f9c0ba9461d21c2ef2d6bb703b39c215777bffd599aef66e" }, "downloads": -1, "filename": "SCCAF-0.0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "24c71b26119dd90c7492a09ba5af3d2b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 24716, "upload_time": "2019-10-08T21:39:30", "url": "https://files.pythonhosted.org/packages/0a/cc/bebb6fcadfa9ed9e6fcc32dad9f0f53cb0ef48482ac29a979fd8afaff37b/SCCAF-0.0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "622e75ccc32742916c1a60b44b7ad5af", "sha256": "f8274346fc83cc42e73e21a4bbf38beb6e228d0b23f975c25b61858d3f8c732e" }, "downloads": -1, "filename": "SCCAF-0.0.6.tar.gz", "has_sig": false, "md5_digest": "622e75ccc32742916c1a60b44b7ad5af", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25619, "upload_time": "2019-10-08T21:39:35", "url": "https://files.pythonhosted.org/packages/c5/29/e443fe7b267f5225bedd55b06a91a3bcc883d9fda3516d45b1604fee8e4d/SCCAF-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "b0f98663080785af54e8d38a9e407767", 
"sha256": "782ca21265fb15122a09fbae6ffc04d6185b29855365fc8834c44b8c5e861c81" }, "downloads": -1, "filename": "SCCAF-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "b0f98663080785af54e8d38a9e407767", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25660, "upload_time": "2019-10-11T08:26:15", "url": "https://files.pythonhosted.org/packages/b9/c0/c82e57305ced60f69046ce63d4d61fb63943e629a48356be054d044d6e5e/SCCAF-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b252a6c74895fab35f04340fa9771873", "sha256": "93957db1aaf9720032a85eb625ae3e95e1299052f8a2e156341bdda9921ba30a" }, "downloads": -1, "filename": "SCCAF-0.0.7.tar.gz", "has_sig": false, "md5_digest": "b252a6c74895fab35f04340fa9771873", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26665, "upload_time": "2019-10-11T08:26:21", "url": "https://files.pythonhosted.org/packages/5b/6a/a125991c5bdc0773b264b8319334942b165adf174275dd8066b99997a07c/SCCAF-0.0.7.tar.gz" } ], "0.0.7.post1": [ { "comment_text": "", "digests": { "md5": "0d25edeb3bab403c1a4be61917a07a80", "sha256": "7a612c33cfbc31d3cf5d33de1d2d0a7ac2244fb50189de0ac45e21b2081aee9d" }, "downloads": -1, "filename": "SCCAF-0.0.7.post1-py3-none-any.whl", "has_sig": false, "md5_digest": "0d25edeb3bab403c1a4be61917a07a80", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25938, "upload_time": "2019-10-18T10:49:51", "url": "https://files.pythonhosted.org/packages/02/c0/2b85d8b65dbd0f361c1f961a0b576fd07c50c834bcd1e0b99f9c473a3184/SCCAF-0.0.7.post1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5a09cd2c35c17f3faa94a6e31582c95c", "sha256": "46e18c5ed00fe33af3a2057c3516437b21871943691d51f3d6015b625fe83a58" }, "downloads": -1, "filename": "SCCAF-0.0.7.post1.tar.gz", "has_sig": false, "md5_digest": "5a09cd2c35c17f3faa94a6e31582c95c", "packagetype": "sdist", "python_version": "source", "requires_python": null, 
"size": 27056, "upload_time": "2019-10-18T10:49:57", "url": "https://files.pythonhosted.org/packages/8f/30/0a7c7719e45387f38fed4623ba143c7a9931465b1195bff14fd9d1de95b9/SCCAF-0.0.7.post1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "0d25edeb3bab403c1a4be61917a07a80", "sha256": "7a612c33cfbc31d3cf5d33de1d2d0a7ac2244fb50189de0ac45e21b2081aee9d" }, "downloads": -1, "filename": "SCCAF-0.0.7.post1-py3-none-any.whl", "has_sig": false, "md5_digest": "0d25edeb3bab403c1a4be61917a07a80", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25938, "upload_time": "2019-10-18T10:49:51", "url": "https://files.pythonhosted.org/packages/02/c0/2b85d8b65dbd0f361c1f961a0b576fd07c50c834bcd1e0b99f9c473a3184/SCCAF-0.0.7.post1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5a09cd2c35c17f3faa94a6e31582c95c", "sha256": "46e18c5ed00fe33af3a2057c3516437b21871943691d51f3d6015b625fe83a58" }, "downloads": -1, "filename": "SCCAF-0.0.7.post1.tar.gz", "has_sig": false, "md5_digest": "5a09cd2c35c17f3faa94a6e31582c95c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27056, "upload_time": "2019-10-18T10:49:57", "url": "https://files.pythonhosted.org/packages/8f/30/0a7c7719e45387f38fed4623ba143c7a9931465b1195bff14fd9d1de95b9/SCCAF-0.0.7.post1.tar.gz" } ] }