{ "info": { "author": "Emily Jane McTavish", "author_email": "ejmctavish@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "#TreeToReads\n\nSimulation pipeline to generate next generation sequencing reads from realistic phylogenies. \nCan be used to test the effects of model of evolution, rates of evolution, \ngenomic distribution of mutations, and phylogenetic relatedness of samples and of reference genome \non SNP calling and evolutionary inference. \n\nInputs are a phylogeny, a genome to be used as a tip in the tree,\nand a set of configuration parameters in a control file.\n\nOptional inputs can include a sequencing error model parameterized from empirical data,\nand a distribution for the distances separating pairs of mutations in the genome.\n\nOutputs are mutated genomes representing all tips in the phylogeny, \nand simulated whole genome sequencing reads representing those genomes. \nThese are useful for testing and comparison of analysis pipelines.\nMutations are currently only single nucleotide variants - no indels or rearrangements.\n\nThe code is still in development - but testing is welcome, and will be supported via email ejmctavish, gmail. 
\n\n##Requirements:\n\n- Seq-Gen\n- Art\n\n(can be run without Art if you want to generate mutated genomes, but not reads)\n- Samtools (to output sorted bam files instead of sam)\n\npython packages\n- Dendropy\n\n\n-------------------------\n\n##To install requirements\n###Install Dendropy\n\n pip2 install dendropy\n\n##### Install seq-gen, software to simulate mutations (http://tree.bio.ed.ac.uk/software/seqgen/) \non ubuntu using apt-get: \n\n sudo apt-get install seq-gen\n\non Mac or linux (using homebrew, http://brew.sh/): \n\n brew install seq-gen\n\n\n##### Art and Samtools are optional, but are required to generate reads from simulated genomes\n##### Install ART, software to generate short reads from simulated genomes (http://www.niehs.nih.gov/research/resources/software/biostatistics/art/)\n\non ubuntu, by downloading the binary: \n\n wget http://www.niehs.nih.gov/research/resources/assets/docs/artbinvanillaicecream031114linux64tgz.tgz\n tar -xzvf artbinvanillaicecream031114linux64tgz.tgz\n\nadd art_illumina to path (see http://askubuntu.com/questions/60218/how-to-add-a-directory-to-my-path)\n\non Mac or linux (using homebrew): \n\n brew install art\n\n##### Install samtools, to generate sorted bam files from sam files (and save disk space) (http://www.htslib.org/)\n\non ubuntu using apt-get: \n\n sudo apt-get install samtools\n\non Mac or linux (using homebrew): \n\n brew install samtools\n\n-----------------------------------------------------------\n##Running the simulations (quick version):\n\n git clone https://github.com/snacktavish/TreeToReads.git\n cd TreeToReads\n python treetoreads.py seqsim.cfg\n \nEdit the config file, seqsim.cfg, to fit your data.\nBy default the script looks for a file called 'seqsim.cfg';\nalternatively, the first argument can be the path to a control file with any name.\n\nCurrently only runs art_illumina and generates paired end illumina data.\nAlternatively, genomes can be generated, and ART run separately using any chosen parameters.\n\n### [Full 
Tutorial](https://github.com/snacktavish/TreeToReads/blob/master/docs/tutorial.md)\n\n---------------------------------------------------------\n##Expected output\nThe script prints out the parameter values and some other useful info.\nIf it runs successfully it will end with\n\"TreeToReads completed successfully!\"\n\nThe output files will be in the output directory specified in the \nseqsim.cfg file, e.g. example_out\nand will consist of:\n\n##Key files\nsequence names are prefixed by 'sim_' \nfasta_files - a folder containing the simulated genomes for each tip in the tree \nfastq - a folder containing one folder per tip, named after the tips of the simulation tree; each of these folders holds the gzipped simulated fastq.\nmutsites.txt - unordered list of the locations of mutations in the genome \n\n###Other files generated by analysis (mostly useless) \nanalysis_configuration.cfg - a copy of the control file used for the analysis \nseqgen.out - output messages from the seq-gen software \nsimtree.tre.bu - a backup copy of the tree \nsimtree.tre - the tree used for simulations: reformatted and polytomies randomly resolved with 0 length branches \nanalysis.sh - the bash commands run by the analysis \nThese folders contain the simulated reads in fastq format \nseqs_sim.txt - an intermediate file used for generating variable sites \nSNPmatrix - a file in format SEQUENCE, BASE, POSITION describing all variable sites in the genome \nart_log - log messages from ART software \n\n###Docker container\nTreeToReads is also available as a [Docker](https://www.docker.com/) container:\n\n\tdocker pull snacktavish/treetoreads\n\tdocker run snacktavish/treetoreads seqsim.cfg\n\t\nto run the default example, or\n\n\tdocker run -v /an/example/path:/a/container/path snacktavish/treetoreads /a/container/path/my_treetoreads_config.cfg\n\t\nto run on real data, where ```/an/example/path/``` contains the file ```my_treetoreads_config.cfg```.\n\n(See the [Docker 
manual](http://docs.docker.com/engine/reference/run/#volume-shared-filesystems) for more information about mounting host directories in the container.)\n\n\n----------------------------------------------------------------------------------------\n\n### Citations\nThis tool relies on Dendropy, ART, and Seqgen.\nPlease cite them (as well as this repo) in any published work using this simulation pipeline (appropriate citations below)\n\nMcTavish E. J., Timme R, (2015) Tree To Reads. https://github.com/snacktavish/TreeToReads \n\nHuang W., Li L, Myers J. R., Marth G. T. (2012). ART: a next-generation sequencing read simulator, Bioinformatics 28 (4): 593-594 \n\nLi H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9 \n\nRambaut A. and Grassly N. C. (1997) Seq-Gen: An application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 13: 235-238 \n\nSukumaran, J. and Mark T. Holder. 2010. DendroPy: A Python library for phylogenetic computing. 
Bioinformatics 26: 1569-1571.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/snacktavish/TreeToReads", "keywords": null, "license": "No copyright", "maintainer": null, "maintainer_email": null, "name": "TreeToReads", "package_url": "https://pypi.org/project/TreeToReads/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/TreeToReads/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/snacktavish/TreeToReads" }, "release_url": "https://pypi.org/project/TreeToReads/0.0.3/", "requires_dist": null, "requires_python": null, "summary": "Tree to Reads - A python script to read a tree, resolve polytomies, generate mutations and simulate NGS reads.", "version": "0.0.3" }, "last_serial": 1901401, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "4c9fb172c0294447bf98964d188fccf0", "sha256": "57734bf78416b8a1fde22f9339fe10e29867e529f3634263b91848ff90e08562" }, "downloads": -1, "filename": "TreeToReads-0.0.2.tar.gz", "has_sig": false, "md5_digest": "4c9fb172c0294447bf98964d188fccf0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11423, "upload_time": "2016-01-12T04:45:45", "url": "https://files.pythonhosted.org/packages/ec/d2/4ba93572426be726a269921052dd1b3eb42983b0c4534d286d111cf3485b/TreeToReads-0.0.2.tar.gz" } ], "0.0.2a": [ { "comment_text": "", "digests": { "md5": "7d3dfb678052a1421762955e1e7e2bbc", "sha256": "40ea09ca0fabb0f52c9ce7dd36089962f245ed53e660fa402112cc360b0059ad" }, "downloads": -1, "filename": "TreeToReads-0.0.2a.tar.gz", "has_sig": false, "md5_digest": "7d3dfb678052a1421762955e1e7e2bbc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12007, "upload_time": "2016-01-12T04:51:13", "url": 
"https://files.pythonhosted.org/packages/70/4e/f9cf62d3be34ca01024ee8a97adf5a0ab2bc7b32e21832ec5f18e1e2ccc7/TreeToReads-0.0.2a.tar.gz" } ], "0.0.2b": [ { "comment_text": "", "digests": { "md5": "1be1b4382fe3f3e7bee68c73e2ec6b17", "sha256": "8f3a9a1a7d32c6f7248a54ad59850a01769827cbfc0bac5fe5c87c38bc359310" }, "downloads": -1, "filename": "TreeToReads-0.0.2b.tar.gz", "has_sig": false, "md5_digest": "1be1b4382fe3f3e7bee68c73e2ec6b17", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12013, "upload_time": "2016-01-12T04:54:29", "url": "https://files.pythonhosted.org/packages/2e/3e/c537617a2aab93235a9b1a8b03d08d156a61a538dfc2d3f130fd3416e1e2/TreeToReads-0.0.2b.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "29c022da6d98773e8dab459299c2a0d2", "sha256": "c4298aa161b837a409ba1250d61afcafd7d9266b5edc2ebd4077fe503134d904" }, "downloads": -1, "filename": "TreeToReads-0.0.3.tar.gz", "has_sig": false, "md5_digest": "29c022da6d98773e8dab459299c2a0d2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12158, "upload_time": "2016-01-12T19:43:30", "url": "https://files.pythonhosted.org/packages/8f/84/1468f316497e3679ef84845ad967468706affb70dbc4d96bf86d7da5d813/TreeToReads-0.0.3.tar.gz" } ], "0.1dev": [ { "comment_text": "", "digests": { "md5": "34c95fbc2f133986cf7eb039103072aa", "sha256": "5ee716c0cb74072b5c942da69be22d329895210b682a6a25fb5c6e1215dd416b" }, "downloads": -1, "filename": "TreeToReads-0.1dev.tar.gz", "has_sig": false, "md5_digest": "34c95fbc2f133986cf7eb039103072aa", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11429, "upload_time": "2016-01-12T04:40:49", "url": "https://files.pythonhosted.org/packages/46/e7/6e61e5f7f46f1321b3276161abb24ec7b32c9c2a54743dfac013e25a7a0a/TreeToReads-0.1dev.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "29c022da6d98773e8dab459299c2a0d2", "sha256": 
"c4298aa161b837a409ba1250d61afcafd7d9266b5edc2ebd4077fe503134d904" }, "downloads": -1, "filename": "TreeToReads-0.0.3.tar.gz", "has_sig": false, "md5_digest": "29c022da6d98773e8dab459299c2a0d2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12158, "upload_time": "2016-01-12T19:43:30", "url": "https://files.pythonhosted.org/packages/8f/84/1468f316497e3679ef84845ad967468706affb70dbc4d96bf86d7da5d813/TreeToReads-0.0.3.tar.gz" } ] }