{ "info": { "author": "Mitchell Stanton-Cook", "author_email": "m.stantoncook@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved", "Natural Language :: English", "Operating System :: POSIX :: Linux", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 2 :: Only", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": ".. image:: https://raw.github.com/mscook/BanzaiDB/master/misc/BanzaiDB.png\n :alt: BanzaiDB logo\n\n----\n\n.. image:: http://gitshields.com/v2/text/API/Unstable/red.png\n :alt: API stability\n\n|\n\n.. image:: https://landscape.io/github/mscook/BanzaiDB/master/landscape.png\n :target: https://landscape.io/github/mscook/BanzaiDB/master\n :alt: Code Health\n\n|\n\n.. image:: http://gitshields.com/v2/drone/github.com/mscook/BanzaiDB/brightgreen-red.png\n :target: https://drone.io/github.com/mscook/BanzaiDB\n :alt: Build status (Drone.io)\n\n(Using landscape.io and drone.io)\n\n\nNews\n----\n\n**API changes (v1 -> v2 -> v3 -> v?)**. The population of a mapping run \ninto BanzaiDB was dependent on a nesoni run. Originally we (API v1) parsed \nthe reports.txt for each strain. In API v2 we parse the nway.any (assumes you \nhave ran nesoni nway). BanzaiDB API v3 assumes that you still have accessto \nthe consensus.fa (called a consensus). We need this data to store information \nin BanzaiDB about coverage.\n\n\nWhat is BanzaiDB?\n-----------------\n\n**Please use the releases (https://github.com/mscook/BanzaiDB/releases). All \nversions including most recent made some significant assumptions that are \ncurrently being improved.**\n\nBanzaiDB is a tool for pairing Microbial Genomics Next Generation Sequencing \n(NGS) analysis with a NoSQL_ database. We use the RethinkDB_ NoSQL database.\n\nBanzaiDB:\n * initialises the NoSQL database and associated tables,\n * populates the database with results of NGS experiments/analysis and,\n * provides a set of query functions to wrangle with the data stored within \n the database.\n\n\nWhy BanzaiDB?\n-------------\n\nThe analysis (primary/secondary/tertiary) of large collections of draft \nmicrobial genomes from NGS typically generates many separate flat files. \n\nThe bioinformatician will:\n * write scripts to parse and extract the important information from \n the results files (often trying to standardise the output from \n similar programs),\n * store these results in further flat files,\n * write scripts to link the results of one analysis step to another,\n * store these results in further flat files,\n * modify scripts as hypothesis is improved as a direct consequence of\n incorporating the knowledge from the previous steps,\n * ...\n * ...\n * ...\n * end up with thousands of flat files, many scripts and generally get \n confused as to how and where everything came from.\n\n**The idea around BanzaiDB is to run once, store once analyse many times.**\n\n\nAbout BanzaiDB\n--------------\n\nBanzaiDB is geared towards outputs of Bioinformatics software employed by \nthe `Banzai NGS pipeline`_. \n\nBanzaiDB is thus geared towards handling data generated from:\n * Velvet and SPAdes (assembly), \n * BWA and Nesoni (mapping/variant calling),\n * Mugsy (whole genome alignment), \n * BRATTNextGen (recombination detection) and,\n * Prokka (annotation).\n\n*The present focus is on storing and manipulating the results of SNP and \nrecombination analysis.*\n\n**Banzai is not a stable API.** \n\nSee the ReadTheDocs site for `BanzaiDB documentation`_ (User & API).\n\n\nAbout RethinkDB\n---------------\n\nWe choose RethinkDB_ as our underlying database for a few reasons:\n * RethinkDB is both developer and operations friendly. This sits well with \n the typical bioinformatician,\n * NoSQL databases allow for a flexible schema. We can store/collect now, \n think later. This is much like how science is performed.\n * Not every bioinformatician or lab has a system administrator. RethinkDB \n is easy to setup and administer\n * We don't know how big our complex our datasets could get in the future. \n It is easy to scale RethinkDB into a cluster.\n * ReQL the underlying query language is nice and simple to\n learn/understand. We're also very comfortable with Python and the \n availability of official python drivers (also JavaScript & Ruby, and a \n heap for user contributed for a swag of languages) is a big bonus.\n\n\nBanzaiDB Requirements\n---------------------\n\nYou will need:\n * (probably) administrator access to your machine(s)\n * a RethinkDB_ server/instance. This can be running locally or on a VPS, \n * git (to clone this repository) and\n * pip_\n * bedtools, samtools and tabix (for pybedtools)\n\nYou will also need a few Python modules:\n * rethinkdb\n * biopython\n * reportlab\n * fabric\n * tablib\n * argparse (if Python 2.6)\n * pybedtools (you will probably also need to install cython)\n\n\nThe Python modules should/will be pulled down automatically when installing \nBanzaiDB.\n\nWe recommend you increase the rethinkdb python `driver performance`_. We have \nfound that in some cases the installation of C++ backend fails. `We provide`_ \na simple protocol that we have found works.\n\n\nBanzaiDB Installation\n---------------------\n\nSomething like this::\n\n $ git clone https://github.com/mscook/BanzaiDB.git\n $ cd BanzaiDB\n $ python setup.py install\n\n\nGetting BanzaiDB talking to RethinkDB\n-------------------------------------\n\nYou provide information about you RethinkDB instance and database using the \nfile **~/.BanzaiDB.cfg** (~/ is shorthand for $HOME).\n\nThe configuration file supports::\n\n db_host = [def = localhost]\n port = [def = 28015]\n db_name = [def = Banzai]\n auth_key = [def = '']\n\n\nBanzaiDB usage\n--------------\n\n**Note:** Please refer to the `BanzaiDB documentation`_ (via ReadTheDocs) for \nmore detailed information (under active development).\n\nOnce both RethinkDB and BanzaiDB are installed and the configuration is set::\n\n $ python BanzaiDB.py -h\n usage: BanzaiDB.py [-h] [-v] {init,populate,update,query} ...\n\n BanzaiDB v 0.3.0 - Database for Banzai NGS pipeline tool\n (http://github.com/mscook/BanzaiDB)\n\n positional arguments:\n {init,populate,update,query}\n Available commands:\n init Initialise a DB\n populate Populates a database with results of an experiment\n update Updates a database with results from a new experiment\n query List available or provide database query functions\n\n optional arguments:\n -h, --help show this help message and exit\n -v, --verbose verbose output\n\n Licence: ECL 2.0 by Mitchell Stanton-Cook \n\n\n\n.. _RethinkDB: http://www.rethinkdb.com\n.. _NoSQL: http://nosql-database.org\n.. _Banzai NGS pipeline: https://github.com/mscook/Banzai-MicrobialGenomics-Pipeline\n.. _BanzaiDB documentation: http://banzaidb.readthedocs.org\n.. _driver performance: http://www.rethinkdb.com/docs/driver-performance/\n.. _pip: http://pip.readthedocs.org/en/latest/installing.html\n.. _We provide: https://raw.githubusercontent.com/mscook/BanzaiDB/master/misc/python_C++_driver.sh", "description_content_type": null, "docs_url": null, "download_url": null, "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/mscook/BanzaiDB", "keywords": null, "license": "ECL 2.0", "maintainer": null, "maintainer_email": null, "name": "BanzaiDB", "package_url": "https://pypi.org/project/BanzaiDB/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/BanzaiDB/", "project_urls": { "Homepage": "http://github.com/mscook/BanzaiDB" }, "release_url": "https://pypi.org/project/BanzaiDB/0.3.0/", "requires_dist": [ "reportlab", "rethinkdb", "biopython", "fabric", "tablib", "argparse" ], "requires_python": null, "summary": "Database for Banzai NGS pipeline tool", "version": "0.3.0" }, "last_serial": 1186209, "releases": { "0.1.2": [ { "comment_text": "", "digests": { "md5": "0b70a45535ba9d2b035c25ccd5d0cbce", "sha256": "73ba95db37bc519ec43b1f8edf101759a4d09b55888e71221d9272636919bd8f" }, "downloads": -1, "filename": "BanzaiDB-0.1.2-py2-none-any.whl", "has_sig": false, "md5_digest": "0b70a45535ba9d2b035c25ccd5d0cbce", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 166517, "upload_time": "2014-07-09T02:56:57", "url": "https://files.pythonhosted.org/packages/f0/50/418ebad79153ab3e16e8e3a2cf387d7410916eb82539a35845bd184aee26/BanzaiDB-0.1.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "bd802c8cb95177a5e7f7c649e7d76c55", "sha256": "dc7d1f520f298cb72dfd33abc5af2d71a9e78d23d647834f865e2fbd1f844324" }, "downloads": -1, "filename": "BanzaiDB-0.1.2.tar.gz", "has_sig": false, "md5_digest": "bd802c8cb95177a5e7f7c649e7d76c55", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 145348, "upload_time": "2014-07-09T02:57:00", "url": "https://files.pythonhosted.org/packages/8d/74/7114de100cde30f3694a802e9748f2259b1d65e2e2a74296148a4fc25d11/BanzaiDB-0.1.2.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "4b14dcb1e0237349cc8869348a1b4fa4", "sha256": "f389a8b76daa7ca1f0003516b534800c6bf47d11005791536c346cac7ef8dd01" }, "downloads": -1, "filename": "BanzaiDB-0.2.0-py2-none-any.whl", "has_sig": false, "md5_digest": "4b14dcb1e0237349cc8869348a1b4fa4", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 169486, "upload_time": "2014-08-08T00:51:49", "url": "https://files.pythonhosted.org/packages/13/ca/fe054c93f5564001368d9179cd9a4f738f41ebc4dc153c8e9a5c306e4f25/BanzaiDB-0.2.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "baac10616c2bd957db4d74bc404d276d", "sha256": "3ecab2c45457d444e57d691319c3e3992f7ecba7f95ce0a542d7f97b31d13ff2" }, "downloads": -1, "filename": "BanzaiDB-0.2.0.tar.gz", "has_sig": false, "md5_digest": "baac10616c2bd957db4d74bc404d276d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 147583, "upload_time": "2014-08-08T00:51:52", "url": "https://files.pythonhosted.org/packages/32/28/3e9b6b3e503d682b1f9705d891b4650db7b377ea00618b9553f97080eccf/BanzaiDB-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "77443af7ee77d1df81e21e1713656bf4", "sha256": "c6f03b2f70c56a0884903df08e639d821f03ddc410fa70cb7fadb84abdd31b19" }, "downloads": -1, "filename": "BanzaiDB-0.3.0-py2-none-any.whl", "has_sig": false, "md5_digest": "77443af7ee77d1df81e21e1713656bf4", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 177644, "upload_time": "2014-08-11T06:10:36", "url": "https://files.pythonhosted.org/packages/20/43/4836eed6d19a90bfb87fcd0b25e49aebe34073e2f608e91acc7d53d9f2fb/BanzaiDB-0.3.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2a11e16864331d769ad1c06dec0a358b", "sha256": "9a4fdfe3a8bd785ce3545dfcbe97ed735b79011e30c6dc6e774db3abb3b8088a" }, "downloads": -1, "filename": "BanzaiDB-0.3.0.tar.gz", "has_sig": false, "md5_digest": "2a11e16864331d769ad1c06dec0a358b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 158230, "upload_time": "2014-08-11T06:10:40", "url": "https://files.pythonhosted.org/packages/c0/81/2f1603c2f5d8f9a5a50ea2be8c6d99fb014eed3b5d202c98ce1b49f33949/BanzaiDB-0.3.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "77443af7ee77d1df81e21e1713656bf4", "sha256": "c6f03b2f70c56a0884903df08e639d821f03ddc410fa70cb7fadb84abdd31b19" }, "downloads": -1, "filename": "BanzaiDB-0.3.0-py2-none-any.whl", "has_sig": false, "md5_digest": "77443af7ee77d1df81e21e1713656bf4", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 177644, "upload_time": "2014-08-11T06:10:36", "url": "https://files.pythonhosted.org/packages/20/43/4836eed6d19a90bfb87fcd0b25e49aebe34073e2f608e91acc7d53d9f2fb/BanzaiDB-0.3.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2a11e16864331d769ad1c06dec0a358b", "sha256": "9a4fdfe3a8bd785ce3545dfcbe97ed735b79011e30c6dc6e774db3abb3b8088a" }, "downloads": -1, "filename": "BanzaiDB-0.3.0.tar.gz", "has_sig": false, "md5_digest": "2a11e16864331d769ad1c06dec0a358b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 158230, "upload_time": "2014-08-11T06:10:40", "url": "https://files.pythonhosted.org/packages/c0/81/2f1603c2f5d8f9a5a50ea2be8c6d99fb014eed3b5d202c98ce1b49f33949/BanzaiDB-0.3.0.tar.gz" } ] }