{ "info": { "author": "Andrew Perry & Bosco Ho", "author_email": "ajperry@pansapiens.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "==========\ninmembrane\n==========\n\n*inmembrane* is a pipeline for proteome annotation to predict if a\nprotein is exposed on the surface of a bacterium. It orchestrates the \nanalysis of protein sequences to provide a summary of which targets may \nbe surface exposed based on predicted subcellular localization signals and \nmembrane topology. Currently, protocols have been implemented for Gram+ and\nGram- bacterial proteomes.\n\nTypical usage is via the script ``inmembrane_scan``, eg::\n\n $ inmembrane_scan mysequences.fasta\n\n\nThe provided sequences \n(in `FASTA format `_) \nare subjected to a number of sequence analyses using external\nprograms (see below) and the results are summarized like::\n\n SPy_0008 CYTOPLASM(non-PSE) . 
SPy_0008 from AE004092\n SPy_0010 PSE-Membrane tmhmm(1) SPy_0010 from AE004092\n SPy_0012 PSE-Cellwall hmm(GW2|GW3|GW1);signalp SPy_0012 from AE004092\n SPy_0016 MEMBRANE(non-PSE) tmhmm(12) SPy_0016 from AE004092\n SPy_0019 SECRETED signalp SPy_0019 from AE004092\n\n\nAs well as printing to stdout, this will generate a summary CSV file \n``mysequences.csv`` and a directory ``mysequences`` containing output\nfiles generated during the run.\n\nAlthough *inmembrane* is primarily designed to be used as a standalone\nprogram, it can also be used as a library like::\n\n import inmembrane\n params = inmembrane.get_params()\n params['fasta'] = \"input.fasta\"\n annotations = inmembrane.process(params)\n\nwhere ``annotations`` is a dictionary of the results, with protein sequence IDs as\nkeys.\n\nYou can also test the functionality of the analysis plugins\nthat are part of *inmembrane* by typing::\n\n $ inmembrane_scan --test\n\nThis can be useful for determining which binary dependencies\nare correctly installed, or for exposing any broken / offline web services\nrequired for a particular analysis.\n\nRunning under Docker\n====================\n\nDocker containers provide a convenient way to run the software in a more\nreproducible environment.\n\nTo create a Docker container and run tests::\n\n $ docker build -t inmembrane:latest .\n $ docker run -it inmembrane -t --skip-tests test_tmhmm,test_signalp4,test_lipop1,test_memsat3\n\nA `Dockerfile-memsat3` is also included that creates a container with MEMSAT3\nand the required Swissprot BLAST database, however you must accept the MEMSAT3\nlicense before using this (ie, no commercial use).\n\nTo run an analysis using the container::\n\n # Run once to get a template inmembrane.config in the current working directory\n $ docker run -it -v $(pwd):/data inmembrane\n\n # Edit inmembrane.config as required. 
Use signap_scrape_web, tmhmm_scrape_web and lipop_scrape_web\n # as the binary versions won't exist in the default container\n # Then, assuming my_proteome.fasta exists in the current working directory, run:\n $ docker run -it -v $(pwd):/data inmembrane my_proteome.fasta\n\nInstallation and Configuration\n==============================\n\nThe latest stable release of *inmembrane* can be installed via \npip, or the bleeding edge from Github.\n\nVia pip::\n\n $ sudo pip install inmembrane\n\nOr from Github::\n\n $ git clone http://github.com/boscoh/inmembrane.git\n $ cd inmembrane\n $ sudo python setup.py install\n\nThe package includes tests, examples, data files, docs.\nHMMER3 is the only *required* external dependency, however\nfor large analyses (multiple proteomes) it is suggested \nthat local versions of other analysis programs are installed \nrather than relying on web services (see Installing dependencies_ below).\n\nThe editable parameters of *inmembrane* are found in\n``inmembrane.config``, which is always located in the same\ndirectory as the main script. If no such file exists, a default\n``inmembrane.config`` will be generated. By default, you probably\ndon't need to change anything.\n\nThe parameters are:\n\n- the path location of the binaries for SignalP, LipoP, TMHMM,\n HMMSEARCH, and MEMSAT. This can be the full path, or just the\n binary name if it is on the system path environment. 
Use ``which``\n to check.\n- 'protocol' to indicate which analysis you want to use.\n Currently, we support:\n \n - ``gram_pos`` the analysis of surface-exposed proteins of Gram+\n bacteria;\n - ``gram_neg`` annotation of subcellular localization and inner\n membrane topology classification for Gram- bacteria\n\n- 'hmm\\_profiles\\_dir': the location of the HMMER profiles for any\n HMM peptide sequence motifs\n- for HMMER, you can set the cutoffs for significance, the E-value\n 'hmm\\_evalue\\_max', and the score 'hmm\\_score\\_min'\n- the shortest length of a loop that sticks out of the\n peptidoglycan layer of a Gram+ bacterium. SurfG+ determined this\n to be 50 amino acids for terminal loops, and twice that for\n internal loops, 100\n- 'helix\\_programs' you can choose which of the\n transmembrane-helix prediction programs you want to use\n\nOutput format\n=============\n\nThe output of the *inmembrane* ``gram_pos`` protocol consists of four\ncolumns. This is printed to stdout and written as a CSV\nfile, which can be opened in spreadsheet software such as Excel.\nThe standard text output can be parsed using space delimiters\n(empty fields in the third column are indicated with a \".\").\nLogging information is prefixed with a '#' character and is sent to\nstderr.\n\nHere's an example::\n\n SPy_0008 CYTOPLASM(non-PSE) . SPy_0008 from AE004092\n SPy_0009 CYTOPLASM(non-PSE) . 
SPy_0009 from AE004092\n SPy_0010 PSE-Membrane tmhmm(1) SPy_0010 from AE004092\n SPy_0012 PSE-Cellwall hmm(GW2|GW3|GW1);signalp SPy_0012 from AE004092\n SPy_0013 PSE-Membrane tmhmm(1) SPy_0013 from AE004092\n SPy_0015 PSE-Membrane tmhmm(2) SPy_0015 from AE004092\n SPy_0016 MEMBRANE(non-PSE) tmhmm(12) SPy_0016 from AE004092\n SPy_0019 SECRETED signalp SPy_0019 from AE004092\n\n\n- the first column is the SeqID, which is the first token in the\n identifier line of the sequence in the FASTA file\n\n- the second column is the prediction, which is one of CYTOPLASM(non-PSE),\n MEMBRANE(non-PSE), PSE-Cellwall, PSE-Membrane, PSE-Lipoprotein or\n SECRETED. Any 'PSE' (Potentially Surface Exposed) annotation means\n that based on the predicted topology, the protein is likely to be\n surface exposed and will be protease accessible in a\n membrane-shaving experiment.\n\n- the third column is a summary of features detected by external\n tools:\n \n - tmhmm(2) means 2 transmembrane helices were found by TMHMM\n - hmm(GW2\\|GW3\\|GW1) means that the GW1, GW2 and GW3 motifs were\n found by HMMER hmmsearch\n - signalp means a secretion signal was found by SignalP\n - lipop means a Sp II secretion signal was found by LipoP with an\n appropriate CYS residue at the cleavage site, which will be\n attached to a phospholipid in the membrane\n\n- the rest of the line gives the full identifier of the sequence\n in the FASTA file.\n\n.. 
_dependencies:\n\nInstalling dependencies\n=======================\n\nWhile *inmembrane* only requires a local installation of HMMER 3.0\nand can use web services for TMHMM, SignalP, LipoP and various\nOMP beta-barrel predictors, for large scale analyses (5000 sequences+)\nit is suggested that locally installed versions are used in the interest\nof speed, and to be polite to publicly available web services.\n\nWith each dependency, it is important that you have the exact version \nthat *inmembrane* is written to inter-operate with, otherwise *inmembrane*\nis likely to be unable to interpret the output of the downstream \nanalysis program.\n\nRequired dependencies, and their versions:\n\n- HMMER 3.0\n- TMHMM 2.0 *or* MEMSAT3\n- SignalP 4.1\n- LipoP 1.0\n\nThese instructions have been tailored for Debian-based systems, in\nparticular Ubuntu 11.10+. Each of these dependencies is licensed\nfree to academic users.\n\nHMMER 3.0\n---------\n\nOn Ubuntu (and other Debian-derived) Linux distributions::\n\n $ sudo apt-get install hmmer\n\nshould be enough.\n\nAlternatively:\n\n- Download HMMER 3.0 from http://hmmer.janelia.org/software.\n- The HMMER user guide describes how to install it. For the\n pre-compiled packages, this is as simple as putting the binaries on\n your PATH.\n\nTMHMM 2.0\n---------\n\nOnly one of TMHMM or MEMSAT3 is required, but users who want to\ncompare transmembrane segment predictions can install both.\n\n\n- Download and install TMHMM 2.0 from\n http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?tmhmm.\n- In the *bin/tmhmm* script, edit the *$opt_basedir* variable to point to\n the full path of where TMHMM is installed.\n\nSignalP 4.1\n-----------\n\n\n- Download SignalP 4.1 from\n http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?signalp. You will need\n to fill out the form with an institutional email address and accept\n the academic license. 
The software will be emailed to you.\n- Follow the installation instructions at\n http://www.cbs.dtu.dk/services/doc/signalp-4.1.readme.\n\nLipoP 1.0\n---------\n\n\n- Download LipoP 1.0 from\n http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?lipop. The\n installation procedure is similar to that for SignalP.\n\nMEMSAT3\n-------\n\n\n- Download MEMSAT3 from\n http://bioinfadmin.cs.ucl.ac.uk/downloads/memsat/memsat3/memsat3.0.tar.gz\n- MEMSAT3 requires NCBI BLAST (\"legacy\" BLAST, not BLAST+) using\n the SwissProt (swissprot) database.\n- Legacy BLAST can be downloaded at\n ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/\n and installed using the instructions provided by NCBI at\n http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/unix_setup.html. We\n have tested with version 2.2.25.\n- You will need both the 'nr' database and the 'swissprot'\n database, since 'swissprot' is indexed against 'nr'. (The other\n option is to download the FASTA version of Uniprot/Swiss-Prot from\n ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz\n and create your own BLAST formatted database using the BLAST\n formatdb tool).\n\n- Edit the *runmemsat* script included with MEMSAT3 to point to\n the correct locations using absolute paths:\n- 'dbname' is the location of your BLAST formatted swissprot\n database\n- 'ncbidir' is the base directory of your BLAST installation\n- 'execdir' is the path where the MEMSAT3 executable resides\n- 'datadir' is the path to the MEMSAT3 data directory\n\n\n(Note that the 'runmemsat' script refers to PSIPRED v2, but it means\nMEMSAT3 - PSIPRED is NOT required).\n\nPython libraries\n----------------\n\n*inmembrane* depends on the following Python libraries (\n`Beautiful Soup `_,\n`mechanize `_,\n`twill `_, \n`Suds `_ and \n`Requests `_).\n\n``pip`` should handle installing these for you automatically.\n\nModification guide\n==================\n\nIt is a fact of life for bioinformatics that 
new versions of basic\ntools change output formats and APIs. We believe that it is an\nessential skill to rewrite parsers to handle the subtle but\nsignificant changes in different versions. We have written\n*inmembrane* to be easily modifiable and extensible. *Protocols*,\nwhich embody a particular high-level workflow, are found in\n``inmembrane/protocols``.\n\nAll interaction with a specific external program or web service has\nbeen wrapped into a single Python *plugin* module and placed in\nthe ``inmembrane/plugins`` directory. Each plugin contains the code both to run the\nprogram and to parse the output. We have tried to make the parsing\ncode as concise as possible. Specifically, by using the native\nPython dictionary, which allows an enormous amount of flexibility,\nwe can collate the results of various analyses with very little code.\n\nA more comprehensive overview can be found at http://boscoh.github.com/inmembrane/api.html.\n\ninmembrane development style guide:\n-----------------------------------\n\nHere are some guidelines for understanding and extending the code. \n\n- *Confidence:* Plugins that wrap an external program should have\n at least one high-level test which is executed by run\\_tests.py.\n This allows new users to immediately determine if their\n dependencies are operating as expected.\n- *Interface:* A plugin that wraps an external program must\n receive a *params* data structure (derived from\n ``inmembrane.config``) and a *proteins* data structure (which is a\n dictionary keyed by sequence id). 
Plugins should return a\n 'proteins' object.\n- *Flexibility:* Plugins should have a 'force' boolean argument\n that will force the analysis to re-run and overwrite output files.\n- *Efficiency:* All plugins should write an output file which is\n read upon invocation to avoid the analysis being re-run.\n- *Documentation:* A plugin must have a Python docstring\n describing what it does, what parameters it requires in the\n ``params`` dictionary and what it adds to the ``proteins`` data\n structure. See the code for examples.\n- *Anal:* Code should follow PEP-8 (4 space indentation) unless there is a\n really really good reason.\n- *Anal:* Unique sequence ID strings (eg ``gi|1234567``) are\n called 'seqid'. 'name' is ambiguous. 'prot\\_id' is reasonable,\n however conceptually a 'protein' is not the same thing as a string\n that represents it's 'sequence' - hence the preference for 'seqid'.\n- *Anal:* All file handles should be closed when they are no\n longer needed.", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/boscoh/inmembrane", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "inmembrane", "package_url": "https://pypi.org/project/inmembrane/", "platform": "", "project_url": "https://pypi.org/project/inmembrane/", "project_urls": { "Homepage": "http://github.com/boscoh/inmembrane" }, "release_url": "https://pypi.org/project/inmembrane/0.95.0/", "requires_dist": null, "requires_python": "", "summary": "A bioinformatic pipeline for proteome annotation to predict if a protein is exposed on the surface of a bacteria.", "version": "0.95.0" }, "last_serial": 3516404, "releases": { "0.9-rc3": [ { "comment_text": "", "digests": { "md5": "1c576edfc8936e63b4d089a86ce0719f", "sha256": "235a9e1dd36762b97a8cad6936026389cc2f0c87d14496f17b6ec4b7af51d38e" }, "downloads": -1, "filename": "inmembrane-0.9-rc3.tar.gz", 
"has_sig": false, "md5_digest": "1c576edfc8936e63b4d089a86ce0719f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1349458, "upload_time": "2012-10-07T23:36:05", "url": "https://files.pythonhosted.org/packages/5b/49/defd827b4a6d138f2487224e9cec879b1bd1dbbe02a66967fa498f3b3b03/inmembrane-0.9-rc3.tar.gz" } ], "0.91": [ { "comment_text": "", "digests": { "md5": "aff1b86ade3b5395b32a66d4c35c5fce", "sha256": "86d8be9a8a58fef38a7a2b194c9b8da201b683f5a3845fe1c2d78c92c44779d3" }, "downloads": -1, "filename": "inmembrane-0.91.tar.gz", "has_sig": false, "md5_digest": "aff1b86ade3b5395b32a66d4c35c5fce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1352119, "upload_time": "2012-10-08T03:06:10", "url": "https://files.pythonhosted.org/packages/19/ea/58d6dc1cd0cae35506bf7f1f1cef03c44139aa321549aa8de9b2a5c9b185/inmembrane-0.91.tar.gz" } ], "0.92": [ { "comment_text": "", "digests": { "md5": "b1f4cd8b777c6644a96386ee2d357647", "sha256": "59b23a4bd043a5c8e168b5696e5d6a1bc9fa834b7e4d6d5dffbfcfb54be5af4f" }, "downloads": -1, "filename": "inmembrane-0.92.tar.gz", "has_sig": false, "md5_digest": "b1f4cd8b777c6644a96386ee2d357647", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1352157, "upload_time": "2012-10-23T04:47:52", "url": "https://files.pythonhosted.org/packages/d1/09/73df024fb9194e508e24ed063dca2b13e815f3e85462227364280b39aa85/inmembrane-0.92.tar.gz" } ], "0.93": [ { "comment_text": "", "digests": { "md5": "1998d3637c9df17c12f366172dd58fc1", "sha256": "e226ec480832127e140955fb02666cb2ba8b792f5eccb6f1aa65275a79a5ad84" }, "downloads": -1, "filename": "inmembrane-0.93.tar.gz", "has_sig": false, "md5_digest": "1998d3637c9df17c12f366172dd58fc1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1355965, "upload_time": "2012-12-03T00:35:44", "url": 
"https://files.pythonhosted.org/packages/48/e4/8f71e2c305bcbfa8efc73322f9d28d0fe9dba59718324eb37b357d62e250/inmembrane-0.93.tar.gz" } ], "0.93.2": [ { "comment_text": "", "digests": { "md5": "dd937bdf212acd932052d8969d043186", "sha256": "60f8abaf57f61c2ee2290779b8eb9733bf9cd288fbc9ad794f46a22f30978542" }, "downloads": -1, "filename": "inmembrane-0.93.2.tar.gz", "has_sig": false, "md5_digest": "dd937bdf212acd932052d8969d043186", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1750186, "upload_time": "2012-12-06T05:17:31", "url": "https://files.pythonhosted.org/packages/4c/e6/29abab3d3076ace74395a22b42568c403b921972cd5f1c105fbb16de8939/inmembrane-0.93.2.tar.gz" } ], "0.94": [ { "comment_text": "", "digests": { "md5": "176a3d1b367c556b195cb115ecf1210d", "sha256": "c249f0cf2fb9842be5bbdee8bbc80c5753f345bf3bcf4bc3f64bf352b7789cb7" }, "downloads": -1, "filename": "inmembrane-0.94.tar.gz", "has_sig": false, "md5_digest": "176a3d1b367c556b195cb115ecf1210d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1751431, "upload_time": "2013-09-29T00:03:23", "url": "https://files.pythonhosted.org/packages/46/96/d74f36ba73f3f8a01d3e58dea94f0fabb07e35d1230d05b2b8d1b42b9456/inmembrane-0.94.tar.gz" } ], "0.94.1": [ { "comment_text": "", "digests": { "md5": "7c41937a56ffc9091a8157c164581de3", "sha256": "7897dacd7770cfc08a5ea9a7f9d2c1081f352275b2c574e15edaae3206fb467c" }, "downloads": -1, "filename": "inmembrane-0.94.1.tar.gz", "has_sig": false, "md5_digest": "7c41937a56ffc9091a8157c164581de3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1750567, "upload_time": "2014-11-18T06:50:36", "url": "https://files.pythonhosted.org/packages/1e/2f/bfb3e7e15cdc5f746ec248b5557264d97f6dcddbe5baeb2e1100d25ec280/inmembrane-0.94.1.tar.gz" } ], "0.94.2": [ { "comment_text": "", "digests": { "md5": "d5841e9b0fdcf3391f474c3e5ab55509", "sha256": 
"483722ed64632e5f3bfde926a43a21184fe6f3cc8618d48c851b2c35fa1c7274" }, "downloads": -1, "filename": "inmembrane-0.94.2.tar.gz", "has_sig": false, "md5_digest": "d5841e9b0fdcf3391f474c3e5ab55509", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1755216, "upload_time": "2015-01-29T02:54:06", "url": "https://files.pythonhosted.org/packages/65/36/7663a42b55cf757fda0ae22765cd75c7c9c19a989f535965b3ca8baee4a0/inmembrane-0.94.2.tar.gz" } ], "0.95.0": [ { "comment_text": "", "digests": { "md5": "680653f51075ffb09ff9782e287f03e6", "sha256": "f19f36cd3799129b85a70273d2f9f42b9b04a6cd01f4183ff2c1b20e1db87557" }, "downloads": -1, "filename": "inmembrane-0.95.0.tar.gz", "has_sig": false, "md5_digest": "680653f51075ffb09ff9782e287f03e6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1751253, "upload_time": "2018-01-24T06:33:54", "url": "https://files.pythonhosted.org/packages/68/05/475b175b9ae884cbe1abdc8c3cd57bf9545504350cab0fd54104b9d00039/inmembrane-0.95.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "680653f51075ffb09ff9782e287f03e6", "sha256": "f19f36cd3799129b85a70273d2f9f42b9b04a6cd01f4183ff2c1b20e1db87557" }, "downloads": -1, "filename": "inmembrane-0.95.0.tar.gz", "has_sig": false, "md5_digest": "680653f51075ffb09ff9782e287f03e6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1751253, "upload_time": "2018-01-24T06:33:54", "url": "https://files.pythonhosted.org/packages/68/05/475b175b9ae884cbe1abdc8c3cd57bf9545504350cab0fd54104b9d00039/inmembrane-0.95.0.tar.gz" } ] }