{ "info": { "author": "Kevin Arvai , Kyle Retterer , Carlos Borroto , Vlad Gainullin ", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "[![](https://img.shields.io/badge/python-3.6+-blue.svg)](https://www.python.org/downloads/release/python-360/)\n\n# phenopy\n`phenopy` is a Python package to perform phenotype similarity scoring by semantic similarity. `phenopy` is a \nlightweight but highly optimized command line tool and library to efficiently perform semantic similarity scoring on \ngeneric entities with phenotype annotations from the [Human Phenotype Ontology (HPO)](https://hpo.jax.org/app/).\n\n![Phenotype Similarity Clustering](https://raw.githubusercontent.com/GeneDx/phenopy/master/notebooks/output/cluster_three_diseases.png)\n\n## Installation\n### GitHub\nInstall from GitHub:\n```bash\ngit clone https://github.com/GeneDx/phenopy.git\ncd phenopy\npython setup.py install\n```\n\n## Command Line Usage\n### Initial setup\nphenopy is designed to run with minimal setup from the user. To run phenopy with default parameters (recommended), skip ahead \nto the [Commands overview](#Commands-overview). \n\nThis section provides details about where phenopy stores data resources and config files. The following occurs when\nyou run phenopy for the first time:\n 1. phenopy creates a `.phenopy/` directory in your home folder and downloads external resources from HPO into the\n `$HOME/.phenopy/data/` directory.\n 2. phenopy stores a binary version of the HPO as a [networkx](https://networkx.github.io/documentation/stable/reference/classes/multidigraph.html) \n graph object here: `$HOME/.phenopy/data/hpo_network.pickle`.\n 3. phenopy creates a `$HOME/.phenopy/phenopy.ini` config file where users can set variables for phenopy to use\n at runtime.\n\n### Commands overview\n`phenopy` is primarily used as a command line tool. 
An entity, as described here, is presented as a sample, gene, or \ndisease, but could be any concept that warrants annotation of phenotype terms. \n\n1. Score similarity of an entity defined by the HPO terms from an input file against all the genes in \n`.phenopy/data/phenotype_to_genes.txt`. We provide a test input file in the repo.\n ```bash\n phenopy score tests/data/test.score.txt\n ```\n Output:\n ```\n #query\tgene\tscore\n SAMPLE\tNCBI:10000[AKT3]\t0.0252\n SAMPLE\tNCBI:10002[NR2E3]\t0.0148\n SAMPLE\tNCBI:100033413[SNORD116-1]\t0.0283\n ...\n ```\n\n2. Score similarity of an entity defined by the HPO terms from an input file against a custom list of entities with HPO annotations, referred to as the `--records-file`.\n ```bash\n phenopy score tests/data/test.score.txt --records-file tests/data/test.score-product.txt\n ```\n Output:\n ```\n #query\tentity_id\tscore\n SAMPLE\t118200\t0.0584\n SAMPLE\t118210\t0.057\n SAMPLE\t118220\t0.0563\n ...\n ```\n\n3. Score pairwise similarity of entities defined in the `--records-file`.\n ```bash\n phenopy score-product tests/data/test.score-product.txt --threads 4\n ```\n Output:\n ```\n 118200\t118200\t0.7692\n 118200\t118300\t0.5345\n 118200\t300905\t0.2647\n ...\n ```\n\n## Parameters\nFor a full list of command arguments use `phenopy [subcommand] --help`:\n```bash\nphenopy score --help\n```\nOutput:\n```\n --records_file=RECORDS_FILE\n One record per line, tab delimited. 
First column record unique identifier, second column pipe separated list of HPO identifiers (HP:0000001).\n --query_name=QUERY_NAME\n Unique identifier for the query file.\n --obo_file=OBO_FILE\n OBO file from https://hpo.jax.org/app/download/ontology.\n --pheno2genes_file=PHENO2GENES_FILE\n Phenotypes to genes from https://hpo.jax.org/app/download/annotation.\n --threads=THREADS\n Number of parallel processes to use.\n --agg_score=AGG_SCORE\n The aggregation method to use for summarizing the similarity matrix between two term sets. Must be one of {'BMA', 'maximum'}\n --no_parents=NO_PARENTS\n If provided, scoring is done by only using the most informative nodes. All parent nodes are removed.\n --hpo_network_file=HPO_NETWORK_FILE\n If provided, phenopy will try to load a cached hpo_network object from file.\n --custom_annotations_file=CUSTOM_ANNOTATIONS_FILE\n A comma-separated list of custom annotation files in the same format as tests/data/test.score-product.txt\n --output_file=OUTPUT_FILE\n filepath where to store the results. \n```\n## Library Usage\nThe `phenopy` library can be used as a `Python` module, allowing more control for advanced users. 
 \n\n```python\nimport os\nfrom phenopy import config\nfrom phenopy.obo import restore\nfrom phenopy.score import Scorer\n\n# Path to the HPO network that phenopy stores in its data directory\nnetwork_file = os.path.join(config.data_directory, 'hpo_network.pickle')\n\n# Load the stored network and build a scorer from it\nhpo = restore(network_file)\nscorer = Scorer(hpo)\n\n# Two sets of HPO terms to compare\nterms_a = ['HP:0001882', 'HP:0011839']\nterms_b = ['HP:0001263', 'HP:0000252']\n\nprint(scorer.score(terms_a, terms_b))\n```\nOutput:\n```\n0.0005\n```\n\nThe library can also be used to prune parent phenotypes from the `phenotype_to_genes.txt` file:\n```python\nimport os\nfrom phenopy import config\nfrom phenopy.obo import restore\nfrom phenopy.util import export_pheno2genes_with_no_parents\n\n\n# Input and output paths inside the phenopy data directory\nnetwork_file = os.path.join(config.data_directory, 'hpo_network.pickle')\nphenotype_to_genes_file = os.path.join(config.data_directory, 'phenotype_to_genes.txt')\nphenotype_to_genes_no_parents_file = os.path.join(config.data_directory, 'phenotype_to_genes_no_parents.txt')\n\n# Load the stored network, then export the annotations without parent terms\nhpo = restore(network_file)\nexport_pheno2genes_with_no_parents(phenotype_to_genes_file, phenotype_to_genes_no_parents_file, hpo)\n```\n\n### Config\nWe recommend the default settings for most users, but the config file *can be* modified: `$HOME/.phenopy/phenopy.ini`.\n\n**IMPORTANT NOTE: \nIf the config variable `hpo_network_file` is defined, phenopy will try to load this stored version of the HPO and ignore \nthe following command-line arguments: `obo_file` and `custom_annotations_file`.**\n\nTo run phenopy with a different `obo_file` or `custom_annotations_file`, \nrename or move the HPO network file: `mv $HOME/.phenopy/data/hpo_network.pickle $HOME/.phenopy/data/hpo_network.old.pickle`\n\nTo run phenopy with a previously stored version of the HPO network, set \n`hpo_network_file = /path/to/hpo_network.pickle`. \n\n## Contributing\nWe welcome contributions from the community. Please follow these steps to set up a local development environment. 
\n```bash\npipenv install --dev\n```\n\nTo run tests locally:\n```bash\npipenv shell\ncoverage run --source=. -m unittest discover --start-directory tests/\ncoverage report -m\n``` \n\n## References\nThe underlying algorithm which determines the semantic similarity for any two HPO terms is based on an implementation of HRSS, [published here](https://www.ncbi.nlm.nih.gov/pubmed/23741529).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "phenopy", "package_url": "https://pypi.org/project/phenopy/", "platform": "", "project_url": "https://pypi.org/project/phenopy/", "project_urls": null, "release_url": "https://pypi.org/project/phenopy/0.2.1/", "requires_dist": [ "fire", "networkx", "numpy", "obonet", "pandas" ], "requires_python": "", "summary": "Phenotype comparison scoring by semantic similarity.", "version": "0.2.1" }, "last_serial": 5809792, "releases": { "0.2.1": [ { "comment_text": "", "digests": { "md5": "8a88c89070b7d95affa2849dcd4aca5c", "sha256": "20940378938674bfff6b7015151052d02ac3affdf86032f64e0a0ea6e5698e7a" }, "downloads": -1, "filename": "phenopy-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "8a88c89070b7d95affa2849dcd4aca5c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25726, "upload_time": "2019-09-10T15:56:01", "url": "https://files.pythonhosted.org/packages/63/5f/cb0157235f05c82d63e540ef5f08e2ab868d29c6b1d8f837e2dc141c6fe0/phenopy-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "aa1769320134c7bfa191af8816905e47", "sha256": "6c0b9044a1190b8ad6abbe0eec2fde36c3343303130347f40642348bc1d9f039" }, "downloads": -1, "filename": "phenopy-0.2.1.tar.gz", "has_sig": false, "md5_digest": "aa1769320134c7bfa191af8816905e47", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 16451, "upload_time": "2019-09-10T15:56:04", "url": "https://files.pythonhosted.org/packages/bb/af/95861bcf98efee21649e8bd7c4e284672d26cbd2823e48a0785989a6f8a1/phenopy-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8a88c89070b7d95affa2849dcd4aca5c", "sha256": "20940378938674bfff6b7015151052d02ac3affdf86032f64e0a0ea6e5698e7a" }, "downloads": -1, "filename": "phenopy-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "8a88c89070b7d95affa2849dcd4aca5c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25726, "upload_time": "2019-09-10T15:56:01", "url": "https://files.pythonhosted.org/packages/63/5f/cb0157235f05c82d63e540ef5f08e2ab868d29c6b1d8f837e2dc141c6fe0/phenopy-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "aa1769320134c7bfa191af8816905e47", "sha256": "6c0b9044a1190b8ad6abbe0eec2fde36c3343303130347f40642348bc1d9f039" }, "downloads": -1, "filename": "phenopy-0.2.1.tar.gz", "has_sig": false, "md5_digest": "aa1769320134c7bfa191af8816905e47", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16451, "upload_time": "2019-09-10T15:56:04", "url": "https://files.pythonhosted.org/packages/bb/af/95861bcf98efee21649e8bd7c4e284672d26cbd2823e48a0785989a6f8a1/phenopy-0.2.1.tar.gz" } ] }