{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "# Running SISRS as a User\n\n## Install Docker\nFollow the instructions [here](https://docs.docker.com/install/) to install\nDocker CE for your operating system.\n\nThere's quite a bit going on in those instructions. As an example, if you're\non Ubuntu you'll follow the specific instructions\n[here](https://docs.docker.com/install/linux/docker-ce/ubuntu/).\n\nIgnore anything that talks about Docker EE. They're just trying to sell you\nstuff.\n\nNote that SISRS currently only runs\non Linux.\n\n## Getting the Docker image\n\nDownload the SISRS docker image which comes with all the dependencies for\nrunning SISRS.\n\n`docker pull anderspitman/sisrs`\n\n## Running SISRS\n\nFirst start up a docker container using the image obtained above:\n\n`docker run -it anderspitman/sisrs bash`\n\nThen from within the Docker container:\n\n```\npip install sisrs\nsisrs-python\n```\n\n\n# Developing SISRS\n\n\nSISRS\n=====\n\nSISRS: Site Identification from Short Read Sequences \nVersion 1.6.2 \nCopyright (c) 2013-2016 Rachel Schwartz \nhttps://github.com/rachelss/SISRS \nMore information: Schwartz, R.S., K.M Harkins, A.C. Stone, and R.A. Cartwright. 2015. A composite genome approach to identify phylogenetically informative data from next-generation sequencing. BMC Bioinformatics. 16:193.\n(http://www.biomedcentral.com/1471-2105/16/193/)\n\nTalk from Evolution 2014 describing SISRS and its application: \nhttps://www.youtube.com/watch?v=0OMPuWc-J2E&list=UUq2cZF2DnfvIUVg4tyRH5Ng\n\nLicense\n=======\n\nThis program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.\n\nThis program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.\n\nRequirements\n============\n* Built-In Genome Assemblers (Required if SISRS is building your composite genome)\n * Velvet (tested with v.1.2.10) (http://www.ebi.ac.uk/~zerbino/velvet/)\n * Minia (tested with v.2.0.7) (http://minia.genouest.org/)\n * AbySS (tested with v.2.0.2) (http://www.bcgsc.ca/platform/bioinfo/software/abyss)\n* Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml)\n* Python 2.7, Biopython, and PySAM\n* Samtools v1.3.1 (http://www.htslib.org/)\n* GNU Parallel (http://www.gnu.org/software/parallel/)\n* MAFFT (http://mafft.cbrc.jp/alignment/software/)\n* BBMap [requires Java] (https://sourceforge.net/projects/bbmap/)\n\n\nInput\n=====\n\nNext-gen sequence data such as Illumina HiSeq reads.\nData must be sorted into folders by taxon (e.g. species or genus).\nPaired reads in fastq format must be specified by _R1 and _R2 in the (otherwise identical) filenames.\nPaired and unpaired reads must have a fastq file extension.\n\nRunning SISRS\n=============\n\n## Usage:\n\n sisrs command options\n\n ### By default, SISRS assumes that\n\n * A reference genome is not available and a composite assembly will be\n assembled using Velvet\n * The K-mer size to be used by Velvet in contig assembly is 21.\n * Only one processor is available.\n * Files are in fastq format.\n * Paired read filenames end with _R1 and _R2\n * A site is only required to have data for two species to be included in the\n final alignment.\n * Folders containing reads are in the present working directory\n * SISRS data will be output into the present working directory\n * A minimum of three reads are required to call the base at a site\n for a taxon.\n\n### Commands: \n\n**sites**: produce an alignment of sites from raw reads \n\n**loci**: produce a set of aligned loci based on the most variable regions of the composite genome \n\n\n#### Subcommands of sites:\n\n**subSample**: run sisrs subsampling scheme, subsampling reads from all taxa to ~10X coverage across species, relative to user-specified genome size \n\n**buildContigs**: given subsampled reads, run sisrs composite genome assembly with user-specified assembler \n\n**alignContigs**: align reads to composite genome as single-ended, uniquely mapped \n\n**mapContigs**: align composite genome reads to a reference genome (optional) \n\n**identifyFixedSites**: find sites with no within-taxa variation \n\n**outputAlignment**: output alignment file of sisrs sites \n\n**changeMissing**: given alignment of sites (alignment.nex), output a file with only sites missing fewer than a specified number of samples per site \n\n\n #### Option Flags:\n\n * -g : the approximate genome size (**mandatory** if sisrs will be assembling a composite genome)\n * -p : use this number of processors *[Default: 1]*\n * -r : the path to the reference genome in fasta format *[Optional]*\n * -k : k-mer size (for assembly) *[Default: 21]* \n * -f : absolute path to the directory containing the folders of reads *[Default: Current Directory]*\n * -z : absolute path to either empty or non-existent directory where SISRS will output data *[Default: Current Directory]*\n * -n : the number of reads required to call a base at a site *[Default: 3]*\n * -t : the threshold for calling a site; e.g. 0.99 means that >99% of bases for that taxon must be one allele; only recommended for low ploidy with <3 individuals *[Default: 1 (100%)]*\n * -m : the number of species that are allowed to have missing data at a site\n * -o : the length of the final loci dataset for dating \n * -l : the number of alleles \n * -a : assembler [velvet, minia, abyss, or premade; *Default: velvet*]\n - If using a premade composite genome, it must be in a folder named 'premadeoutput' in the same directory as the folders of read data, and must be called 'contigs.fa'\n * -s : Sites to analyze when running 'loci' [0,1,2]\n - 0 [Default], all variable sites, including singletons\n - 1, variable sites excluding singletons\n - 2, only biallelic variable sites\n * -c : continuous command mode for calling subcommands [1,0] \n - 1 [Default]: calling a subcommand runs that subcommand **and all subsequent steps in the pipeline**\n - 0: calling a subcommand runs **only** that subcommand\n\nOutput\n======\n\nNexus file with variable sites in a single alignment. Usable in most major phylogenetics software as a concatenated alignment with a setting for variable-sites-only.\n\nTest Data\n=========\n\nThe folder test_data (https://github.com/rachelss/SISRS_test_data) contains simulated data for 10 species on the tree found in simtree.tre . Using 40 processors this run took 9 minutes. Analysis of the alignment output by sisrs using raxml produced the correct tree.\n\nSample commands\n==============\n\n1. Basic sisrs run: start with fastq files and produce an alignment of variable sites\n```\nsisrs sites -g 1745690\n```\n2. Basic sisrs run with modifications\n```\nsisrs sites -g 1745690 -p 40 -m 4 -f /usr/test_data -z /usr/output_data -t .99 -a minia\n```\n3. Run only sisrs read subsampling step\n```\nsisrs subSample -g 1745690 -f /usr/test_data -c 0\n```\n4. Produce an alignment of loci based on the most variable loci in your basic sisrs run. Note - this command will run sisrs sites if (and only if) it was not run previously.\n```\nsisrs loci -g 1745690 -p 40 -l 2 -f /usr/test_data # Will run sites first, then loci\nsisrs loci -g 1745690 -p 40 -l 2 -f /usr/SISRS_sites_ouput # Will run loci from previous sites data\n```\n5. Get loci from your fastq files given known loci.\n\n first name your reference loci ref_genes.fa and put in your main folder\n```\nsisrs loci -p 40 -f /usr/test_data\n```\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://cartwrig.ht/software/", "keywords": "bioinformatics genetics", "license": "GPLv3", "maintainer": "", "maintainer_email": "", "name": "sisrs", "package_url": "https://pypi.org/project/sisrs/", "platform": "", "project_url": "https://pypi.org/project/sisrs/", "project_urls": { "Homepage": "https://cartwrig.ht/software/" }, "release_url": "https://pypi.org/project/sisrs/0.1.0/", "requires_dist": [ "biopython" ], "requires_python": ">=3.6", "summary": "Site Identification from Short Read Sequences.", "version": "0.1.0" }, "last_serial": 3715711, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "3528046ff1c477e60af1e45d39cabc2c", "sha256": "2d035269f4eadc448ec92c3e9ede1f486faa468a1b56fe62e0f396fb1d2b98d1" }, "downloads": -1, "filename": "sisrs-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "3528046ff1c477e60af1e45d39cabc2c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 26660, "upload_time": "2018-03-29T02:51:11", "url": "https://files.pythonhosted.org/packages/81/83/1472f81b5483ad7613952b1f8f537a22dcb4c0d6eaa1c4392c2f9d13fd06/sisrs-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e766c89a5a05c0941b9a3bdd31d03827", "sha256": "886596c0e2f5651400adf7f562c1b9f10a9ec647b67d41893b5e1baa3b7eb5ce" }, "downloads": -1, "filename": "sisrs-0.1.0.tar.gz", "has_sig": false, "md5_digest": "e766c89a5a05c0941b9a3bdd31d03827", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 20798, "upload_time": "2018-03-29T02:51:12", "url": "https://files.pythonhosted.org/packages/49/90/8b6f033add96ce92450ad1967504229d05c39b21b50bf8a3429bcac1fa8d/sisrs-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3528046ff1c477e60af1e45d39cabc2c", "sha256": "2d035269f4eadc448ec92c3e9ede1f486faa468a1b56fe62e0f396fb1d2b98d1" }, "downloads": -1, "filename": "sisrs-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "3528046ff1c477e60af1e45d39cabc2c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 26660, "upload_time": "2018-03-29T02:51:11", "url": "https://files.pythonhosted.org/packages/81/83/1472f81b5483ad7613952b1f8f537a22dcb4c0d6eaa1c4392c2f9d13fd06/sisrs-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e766c89a5a05c0941b9a3bdd31d03827", "sha256": "886596c0e2f5651400adf7f562c1b9f10a9ec647b67d41893b5e1baa3b7eb5ce" }, "downloads": -1, "filename": "sisrs-0.1.0.tar.gz", "has_sig": false, "md5_digest": "e766c89a5a05c0941b9a3bdd31d03827", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 20798, "upload_time": "2018-03-29T02:51:12", "url": "https://files.pythonhosted.org/packages/49/90/8b6f033add96ce92450ad1967504229d05c39b21b50bf8a3429bcac1fa8d/sisrs-0.1.0.tar.gz" } ] }