{
"info": {
"author": "FEI YUAN",
"author_email": "yuanfeifuzzy@gmail.com",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 3 - Alpha",
"Environment :: Console",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: MIT License",
"Natural Language :: English",
"Programming Language :: Python :: 3.4",
"Topic :: Scientific/Engineering :: Bio-Informatics"
],
"description": ".. _intro-overview:\n\nProtParCon\n==========\n\nProtParCon is an application framework for manipulating molecular data and\nidentifying parallel and convergent amino acid replacements at the\nmolecular level. Although ProtParCon was not designed for implementing new\nmethods or algorithms for molecular data manipulation, ProtParCon integrates\nseveral widely used programs for multiple sequence alignment (MSA),\nancestral states reconstruction (ASR), protein sequence simulation,\nMaximum-Likelihood tree inference (ML Tree) and molecular convergence\nidentification. Therefore, it can be used as a general tool to do MSA,\nASR, and simulation under a common interface by using various\npre-existed programs under hood.\n\nWork-Flow of ProtParCon\n=======================\n\nProtParCon processes a set of orthologous protein sequences with known\nphylogenetic relationships in six stages:\n\n- MSA\n- ASR\n- IDENTIFY\n- SIM\n- IDENTIFY,\n- TEST\n\n.. figure:: https://www.mdpi.com/genes/genes-10-00181/article_deploy/html/images/genes-10-00181-g001.png\n :alt: Overview of the ProtParCon analytical scheme\n :align: center\n\n Overview of the ProtParCon analytical scheme\n\nDuring the multiple sequence alignment (MSA) stage, protein sequences are\naligned while gaps and ambiguous character states are trimmed.\n\nIn the ancestral state reconstruction (ASR) stage, ancestral character states at\neach site are inferred for each internal node in the reconstructed tree.\n\nObserved parallel and convergent amino acid replacements for pairs of branches\nare identified in the IDENTIFY stage. Parallel replacements are denoted by P\n(red) and convergent replacements by C (blue).\n\nSimulations are conducted in the SIM (simulation) stage. Simulated sequences\nare evolved according to the following parameters:\n\na. an evolutionary model (a replacement rate matrix)\nb. the branching pattern and branch lengths of the tree estimated in the ASR\n stage\nc. 
amino acid frequencies and sequence length estimated from the trimmed\n alignment.\n\nExpected parallel and convergent replacements are identified after the SIM\nstage, or they are calculated directly if no simulation is conducted.\n\nThe differences between the numbers of observed and expected parallel and\nconvergent replacements for branch pairs of interest are tested during the\nTEST stage. For better readability, only part of the simulated sequences and\ndetailed P&C data are shown. TSV (Tab Separated Values) format data are\nreformatted. The branch-pair notation A-B denotes a pair of\nbranches leading to A and B, respectively. R1 and R2 represent two\namino acid replacement events along two branches. The standard one-letter\nabbreviations for amino acids are used for the replacements.\n\nWalk-through of an example\n==========================\n\nTo show you what ProtParCon brings to the table, we'll walk you through\nan example using the simplest way to identify parallel and convergent amino\nacid replacements at the protein sequence level.\n\nHere is the code that uses ProtParCon to identify parallel and convergent amino acid\nreplacements within an orthologous protein::\n\n from ProtParCon import imc\n\n sequence = 'path/to/the/orthologous/protein/sequence'\n tree = 'path/to/the/phylogenetic/tree'\n muscle = 'path/to/the/executable/of/muscle/alignment/program'\n codeml = 'path/to/the/executable/of/codeml/program'\n evolver = 'path/to/the/executable/of/evolver/program'\n\n imc(sequence, tree, aligner=muscle, ancestor=codeml, simulator=evolver)\n\n\nPut the above code in a text file, name it something like `imc_analyze.py`,\nand run the script using Python in a terminal::\n\n $ python imc_analyze.py\n\n\nWhen the script finishes, you will have six files in your working directory:\n`msa.fa`, `trimmed.msa.fa`, `ancestors.tsv`, `simulations.tsv`,\n`imc.counts.tsv`, and `imc.details.tsv`. 
From their names, you may already guess\nwhat these files contain. The `imc.counts.tsv` file contains the number of\nparallel and convergent amino acid replacements that have been identified among\nall comparable branches, and it looks like this (reformatted here for better\nreadability)::\n\n Category  BranchPair  OBS  SIM-1  SIM-2  SIM-3  SIM-4  SIM-5\n P         A-B         0    0      1      1      0      1\n P         A-NODE10    0    0      0      0      0      0\n P         A-NODE11    3    2      1      0      3      2\n P         A-NODE13    0    2      0      1      0      2\n P         A-E         0    0      0      0      0      0\n C         A-B         0    2      1      2      2      0\n C         A-NODE10    0    0      0      0      0      0\n C         A-NODE11    0    0      0      1      1      0\n C         A-NODE13    0    0      1      2      0      1\n\nThe `imc.details.tsv` file contains the details of the parallel and convergent amino\nacid replacements that have been identified, e.g. the branch pair in which each\nreplacement occurred, its position in the protein sequence, the kind of\nreplacement, and so on.\n\n\nWhat just happened?\n===================\n\nWhen you run the script, ``imc()`` looks for a sequence file and passes it\nto a multiple sequence alignment program (``MUSCLE``\nwas used in this example). Once the sequence alignment is done,\n``imc()`` looks for a phylogenetic tree file and passes it, along with the\nalignment (with all gaps and ambiguous characters already removed), to an ancestral\nsequence reconstruction program (the ``CODEML`` program in the\nPAML package is used here) to\ninfer the ancestral states. Since a simulation program (the ``EVOLVER`` program\nin the PAML package) is specified via the ``simulator`` argument, ProtParCon will\nautomatically prepare all files needed by evolver and then use evolver to\nconduct sequence simulation. Once all these steps are done, ``imc()``\nidentifies parallel and convergent amino acid replacements along the\nprotein sequence and finally saves the results to text files.\n\nHere you can see one of the main advantages of ProtParCon: sequence\nalignment, ancestral state reconstruction, and sequence simulation are\ndone automatically, without users calling each program step\nby step. 
This means ProtParCon already has a pipeline that chains all these\nprocesses together; users only need to tell ProtParCon how they want\nthe sequences to be handled and what results they want to get. Another\nadvantage of using ProtParCon is that it provides a common interface for all\nsupported programs, so users no longer need to learn how to use each program\nor how to handle its results.\n\nWhile ProtParCon enables users to quickly identify parallel and convergent amino\nacid replacements (using only a single sequence file and a tree file),\nit also gives users full control of the identification process by\nexplicitly managing the workflow step by step. Users are able to do things like\nchoosing a preferred sequence alignment program to get a high-quality sequence\nalignment, passing more parameters to the ancestral state reconstruction program\nto get accurate ancestral states, and taking full control of the sequence\nsimulation process by explicitly using the simulation module with additional\noptions.\n\n\nWhat else?\n==========\n\nYou've seen how to quickly identify parallel and convergent amino acid\nreplacements using the general function ``imc()`` in the ProtParCon package, but this\nonly scratches the surface. 
ProtParCon provides many features for\nmanipulating molecular data that make parallelism and convergence\nidentification, and even phylogenetic analysis, much easier and more efficient,\nsuch as:\n\n* Built-in support for several sequence alignment programs for multiple\n sequence alignment (MSA) through a simple function.\n\n* Built-in support for several phylogenetic tree inference programs for\n inferring the best maximum-likelihood tree through a simple function.\n\n* Built-in support for several ancestral state reconstruction programs for\n ancestral state reconstruction (ASR) through a simple function.\n\n* Built-in support for several sequence simulation programs for simulating\n sequences under various evolutionary scenarios through a simple function.\n\n* Built-in support for identifying parallel and convergent amino acid\n replacements using raw orthologous sequences, a multiple sequence alignment,\n reconstructed ancestral sequences, or even simulated sequences.\n\n\nWhat's next?\n============\n\nThe next steps for you are to install ProtParCon, follow the pre-made\nexamples to learn how to use its full capabilities, use ProtParCon\nin your routine work to ease the process of molecular data manipulation and\nmolecular parallelism and convergence identification, and, if you are\ninterested, extend ProtParCon to support more programs.\nThanks for your interest!\n\n\nSee the full description and `documentation`_ of ProtParCon for more details!\n\n.. _documentation: https://ibiology.github.io/ProtParCon/\n\n\n",
"description_content_type": "",
"docs_url": null,
"download_url": "",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/iBiology/ProtParCon",
"keywords": "phylogeny tree alignment simulation biology bioinformatics",
"license": "MIT",
"maintainer": "",
"maintainer_email": "",
"name": "ProtParCon",
"package_url": "https://pypi.org/project/ProtParCon/",
"platform": "",
"project_url": "https://pypi.org/project/ProtParCon/",
"project_urls": {
"Homepage": "https://github.com/iBiology/ProtParCon"
},
"release_url": "https://pypi.org/project/ProtParCon/3.1/",
"requires_dist": [
"biopython (>=1.71)"
],
"requires_python": "",
"summary": "ProtParCon - A framework for framework for processing molecular data and identifying parallel and convergent amino acid replacements.",
"version": "3.1"
},
"last_serial": 4919997,
"releases": {
"1.0": [
{
"comment_text": "",
"digests": {
"md5": "3268b72f49d72a6fccc4589e632e62f7",
"sha256": "bcfb8c7bdb288b51e89fe19a455bb9c8c79c3b96410d1a41afb263197cd2e0c8"
},
"downloads": -1,
"filename": "ProtParCon-1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3268b72f49d72a6fccc4589e632e62f7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 57252,
"upload_time": "2018-11-13T16:42:34",
"url": "https://files.pythonhosted.org/packages/1f/7f/cc148fc3a6b7484398b917790db0a4cc8e4f4023fe4baf49ed5fba7815e5/ProtParCon-1.0-py3-none-any.whl"
},
{
"comment_text": "",
"digests": {
"md5": "43bcb67f68c7ef55be26d00a46317280",
"sha256": "e392982b038a4c275b328027b5da7a6b7352fc933aac7a428a2795e569f11f3c"
},
"downloads": -1,
"filename": "ProtParCon-1.0.tar.gz",
"has_sig": false,
"md5_digest": "43bcb67f68c7ef55be26d00a46317280",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 62946,
"upload_time": "2018-11-13T16:42:36",
"url": "https://files.pythonhosted.org/packages/89/31/0d9aa2e1b95943a42f6e2f109112af13cf2922ae80b6ff971766fec8da88/ProtParCon-1.0.tar.gz"
}
],
"2.0": [
{
"comment_text": "",
"digests": {
"md5": "53608cbf5c9810ac932ef7be26853609",
"sha256": "e68969039307efa5244df7a052936174dd61209a61f323d5fdf2fe008f3ae3f0"
},
"downloads": -1,
"filename": "ProtParCon-2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "53608cbf5c9810ac932ef7be26853609",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 308881,
"upload_time": "2019-03-06T18:29:55",
"url": "https://files.pythonhosted.org/packages/43/13/f0f5e645ff47ba3a54bd4da60d281640a7e3a91af0d22baad1e377c25101/ProtParCon-2.0-py3-none-any.whl"
},
{
"comment_text": "",
"digests": {
"md5": "b950f12566aeaf37f6af1fc04e26b1ce",
"sha256": "adfb6416c4c89ca3c54d7c8618afed7209d749d640ec2872896706c1843355b6"
},
"downloads": -1,
"filename": "ProtParCon-2.0.tar.gz",
"has_sig": false,
"md5_digest": "b950f12566aeaf37f6af1fc04e26b1ce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 99084,
"upload_time": "2019-03-06T18:29:57",
"url": "https://files.pythonhosted.org/packages/a7/2c/3cb1738f8098f0809bab0d3ba91c01c6a477fb5a1fdf1ff2394b5b9f2c60/ProtParCon-2.0.tar.gz"
}
],
"3.0": [
{
"comment_text": "",
"digests": {
"md5": "3f5448d03153ddc65e35c0aba245e3a2",
"sha256": "f9654107ef999c65ae2b7a1b925aaef84e5c95f02a0d1610d8c44c01dbdb45b4"
},
"downloads": -1,
"filename": "ProtParCon-3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3f5448d03153ddc65e35c0aba245e3a2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 116605,
"upload_time": "2019-03-10T00:01:14",
"url": "https://files.pythonhosted.org/packages/97/3a/889d51154765b4ac7798ef1eca7bc87c2579ddf97a57068836833bf41755/ProtParCon-3.0-py3-none-any.whl"
},
{
"comment_text": "",
"digests": {
"md5": "3612a46a000430e71de6b9f713ee19db",
"sha256": "c1dcab97039046671ba1e5bb643aaa3a3125c1e106af71e926a3b7eb71082950"
},
"downloads": -1,
"filename": "ProtParCon-3.0.tar.gz",
"has_sig": false,
"md5_digest": "3612a46a000430e71de6b9f713ee19db",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 101316,
"upload_time": "2019-03-10T00:01:15",
"url": "https://files.pythonhosted.org/packages/85/1f/c3d694724c710138b6daa0403555b346ff4d917acdfbba7863c33d45d509/ProtParCon-3.0.tar.gz"
}
],
"3.1": [
{
"comment_text": "",
"digests": {
"md5": "78d7cfc523e28075a81ce8caeb521b7f",
"sha256": "471389f5a5dbc06d893c0c946789e36b96027871c1a596e5c6ceb91acab1d8a2"
},
"downloads": -1,
"filename": "ProtParCon-3.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "78d7cfc523e28075a81ce8caeb521b7f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 116605,
"upload_time": "2019-03-10T00:19:48",
"url": "https://files.pythonhosted.org/packages/d3/b9/f07c927a5398e2170558d065ec173204cc119ef3a7709b584c15dddbb7da/ProtParCon-3.1-py3-none-any.whl"
},
{
"comment_text": "",
"digests": {
"md5": "9fe0ea03844f7df4e8c404bd19e2b3ce",
"sha256": "da6f8bdf8afbbf1db3861372af0b090961024282a92d4de020483d97bfbc1f79"
},
"downloads": -1,
"filename": "ProtParCon-3.1.tar.gz",
"has_sig": false,
"md5_digest": "9fe0ea03844f7df4e8c404bd19e2b3ce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 101313,
"upload_time": "2019-03-10T00:19:50",
"url": "https://files.pythonhosted.org/packages/30/b2/c3040e39cce4511cf30ceccb4e25ac5fa30cb7d8449c673f8a9695ea4040/ProtParCon-3.1.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "78d7cfc523e28075a81ce8caeb521b7f",
"sha256": "471389f5a5dbc06d893c0c946789e36b96027871c1a596e5c6ceb91acab1d8a2"
},
"downloads": -1,
"filename": "ProtParCon-3.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "78d7cfc523e28075a81ce8caeb521b7f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 116605,
"upload_time": "2019-03-10T00:19:48",
"url": "https://files.pythonhosted.org/packages/d3/b9/f07c927a5398e2170558d065ec173204cc119ef3a7709b584c15dddbb7da/ProtParCon-3.1-py3-none-any.whl"
},
{
"comment_text": "",
"digests": {
"md5": "9fe0ea03844f7df4e8c404bd19e2b3ce",
"sha256": "da6f8bdf8afbbf1db3861372af0b090961024282a92d4de020483d97bfbc1f79"
},
"downloads": -1,
"filename": "ProtParCon-3.1.tar.gz",
"has_sig": false,
"md5_digest": "9fe0ea03844f7df4e8c404bd19e2b3ce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 101313,
"upload_time": "2019-03-10T00:19:50",
"url": "https://files.pythonhosted.org/packages/30/b2/c3040e39cce4511cf30ceccb4e25ac5fa30cb7d8449c673f8a9695ea4040/ProtParCon-3.1.tar.gz"
}
]
}