{ "info": { "author": "Mark van Roosmalen", "author_email": "m.vanroosmalen-2@umcutrecht.nl", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "NanoSV User Guide\n=================\n\nTable of Contents\n-----------------\n\n\n* `NanoSV <#nanosv>`_\n\n * `Summary <#summary>`_\n * `Installation <#installation>`_\n * `Citation <#citation>`_\n\n* `Pre-processing <#pre-processing>`_\n\n * `Basecalling <#basecalling>`_\n * `Mapping <#mapping>`_\n * `LAST <#last-mapping>`_\n\n * `LAST installation <#last-installation>`_\n * `Running LAST <#running-last>`_\n\n* `SV calling using NanoSV <#sv-calling-using-nanosv>`_\n\n * `NanoSV usage <#nanosv-usage>`_\n * `NanoSV arguments and parameters <#nanosv-arguments-and-parameters>`_\n\n * `Required arguments <#required-arguments>`_\n * `Optional arguments <#optional-arguments>`_\n * `Optional configuration parameters <#optional-configuration-parameters>`_\n * `Ancillary files that can be used for running NanoSV <#ancillary-files-that-can-be-used-for-running-nanosv>`_\n\n * `NanoSV output <#nanosv-output>`_\n\nNanoSV\n------\n\nSummary\n^^^^^^^\n\nNanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies\u2019 MinION, GridION or PromethION instruments, or Pacific Biosciences RSII or Sequel sequencers.\nNanoSV has been extensively tested using Oxford Nanopore MinION sequencing data, as described here: \nhttps://www.nature.com/articles/s41467-017-01343-4\nhttps://link.springer.com/article/10.1007%2Fs00401-017-1743-5\n\nThe core algorithm of NanoSV identifies split- and gapped-aligned reads and clusters the reads according to the orientations and genomic positions of the read segments to define 
breakpoint-junctions of structural variations.\n\nInstallation\n^^^^^^^^^^^^\n\nNanoSV needs a working installation of Python 3. You can install NanoSV using pip:\n\n.. code-block::\n\n > pip install nanosv\n\nAlternatively, you can get NanoSV from bioconda:\n\n.. code-block::\n\n conda install -c bioconda nanosv\n\nCitation\n^^^^^^^^\n\nCretu Stancu, M. *et al.* Mapping and phasing of structural variation in patient genomes using nanopore sequencing. Nat. Commun. 8, 1326 **(2017)**. (https://www.nature.com/articles/s41467-017-01343-4)\n\nPre-processing\n--------------\n\nBasecalling\n^^^^^^^^^^^\n\nRaw sequencing data can be basecalled using any available basecaller that is suited for your data. We use albacore for MinION/GridION/PromethION sequencing data, available through the nanopore community: http://community.nanoporetech.com/. Albacore can directly produce fastq files suitable for subsequent mapping to a reference genome.\n\nMapping\n^^^^^^^\n\nNanoSV has been tested with different long read mappers, including BWA MEM, MINIMAP2, LAST and NGMLR. Below you can see how we typically run these mappers. \n\nBWA MEM\n~~~~~~~\n\n.. code-block::\n\n > bwa mem -x ont2d -M -t 8 \n\nMINIMAP2\n~~~~~~~~\n\n.. code-block::\n\n > minimap2 -t 8 -a \n\nNGMLR\n~~~~~\n\n.. code-block::\n\n > ngmlr -x ont -t 8 -r -q \n\nLAST\n~~~~\n\nWe found that LAST alignments give the most accurate results for SV calling with NanoSV. However, mapping with LAST requires more compute resources. Follow the instructions below if you would like to use LAST alignment as input for your SV calling with NanoSV. \n\nLAST installation\n\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\n\nDownload the zip file from http://last.cbrc.jp/\n\n.. code-block::\n\n > unzip last.zip\n > cd /path/to/lastdir\n > make\n\nRunning LAST\n\"\"\"\"\"\"\"\"\"\"\"\"\n\nFirst you need to index your reference genome by creating a lastal database:\n\n.. 
code-block::\n\n > lastdb [referencedb] [reference.fa]\n\nTrain LAST to get the best scoring parameters for your particular alignment. We typically use a subset of more than 10,000 reads for this step:\n\n.. code-block::\n\n > last-train -Q1 [referencedb] [reads_sample.fastq] > [my-params]\n\nMap your fastq data to the reference:\n\n.. code-block::\n\n > lastal -Q1 -p [my-params] [referencedb] [reads.fastq] | last-split > [reads.maf]\n\nConvert the MAF file to SAM format:\n\n.. code-block::\n\n > maf-convert -f [reference.dict] sam -r 'ID:[id] PL:[nanopore] SM:[sample]' [reads.maf] > [reads.sam]\n\nThe ``[reference.dict]`` file can be created by picard:\n\n.. code-block::\n\n > java -jar picard.jar CreateSequenceDictionary REFERENCE=[reference.fa] OUTPUT=[reference.dict]\n\nConvert the SAM file to a BAM file using sambamba (https://github.com/biod/sambamba) (samtools may function similarly):\n\n.. code-block::\n\n > sambamba view -h -S --format=bam [reads.sam] > [reads.bam]\n\nSort the BAM file using sambamba: \n\n.. code-block::\n\n > sambamba sort [reads.bam]\n\nThe mapping, conversion and sorting commands above can also be run at once using pipes:\n\n.. code-block::\n\n > lastal -Q1 -p [my-params] [referencedb] [reads.fastq] | \\\n > last-split | \\\n > maf-convert -f [reference.dict] sam -r 'ID:[id] PL:[nanopore] SM:[sample]' /dev/stdin | \\\n > sambamba view -h -S --format=bam /dev/stdin | \\\n > sambamba sort /dev/stdin -o [reads.sorted.bam]\n\nSV calling using NanoSV\n-----------------------\n\nNanoSV usage\n^^^^^^^^^^^^\n\n.. code-block::\n\n > NanoSV [-h] [-t THREADS] [-s SAMBAMBA] [-c CONFIG] [-b BED] [-o OUTPUT] [reads.sorted.bam]\n\nNanoSV arguments and parameters:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nrequired arguments:\n~~~~~~~~~~~~~~~~~~~\n\n.. code-block::\n\n bam : /path/to/reads.sorted.bam\n\nThis BAM file needs to be coordinate-sorted and indexed. Note that if you are performing SV calling on a large genome (e.g. 
human) and are only interested in calling intrachromosomal SVs, you may gain speed by splitting your BAM file by chromosome and running NanoSV per chromosome (on a compute cluster). \n\noptional arguments:\n~~~~~~~~~~~~~~~~~~~\n\n.. code-block::\n\n -h, --help : Show the help message and exit\n\n -t, --threads : Maximum number of threads to use [default: 4 ]\n\n -s, --sambamba : Give the full path to the sambamba or samtools executable [default: sambamba ]\n\n -c, --config : Give the full path to your own ini file [ default: config.ini ]\n\n -b, --bed : Give the full path to your own bed file, used for coverage depth calculations [default: human_hg19.bed ]\n\n -o, --output : Give the full path to the output vcf file [default: ]\n\noptional configuration parameters:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nNanoSV uses a config.ini file which contains default settings for all running parameters. Users can change the parameters by creating their own config.ini file and providing it with the -c command-line argument.\n\n.. code-block::\n\n #Read segment options\n [Filter options]\n # Maximum number of segments per read resulting from the mapping of the read to the reference sequence\n max_split = 10\n # Minimum percentage of identical bases of the mapped segment relative to the reference sequence\n min_pid = 0.7\n # Minimum mapping quality of the segment\n min_mapq = 20\n\n #Parameters for tuning detection and clustering of breakpoints:\n [Detection options]\n # Maximum distance between two adjacent break-end positions\n cluster_distance = 10\n # Minimum number of breakpoint-junctions (i.e. split-read junctions) for clustering\n cluster_count = 2\n # Minimum flanking length, to consider a read a reference read\n refreads_distance = 100\n # Minimum length of unmapped sequence for hanging reads that overlap a break-end\n hanging_length = 20\n # Maximum distance to search for the MATEID, i.e. 
a reciprocal breakpoint-junction, for example an inversion consists of two breakpoint-junctions (3\u2019-to-3\u2019 and 5\u2019-to-5\u2019)\n mate_distance = 300\n # If True, NanoSV will check the depth of coverage for possible breakpoint-junctions with orientations that indicate a possible deletion or duplication (3\u2019-to-5\u2019 and 5\u2019-to-3\u2019)\n depth_support = True\n # Minimum indel size to call a gap and create subsegments\n min_indel_size = 30\n\n #Parameters for setting the FILTER flag in the vcf output:\n [Output filter options]\n # Filter flag: LowQual, if the QUAL score is lower than specified by this parameter\n qual_flag = 20\n # Filter flag: SVcluster, if there are more SVs within a window size, they will be marked as SVcluster\n window_size = 1000\n # Filter flag: SVcluster, indicating the number of SVs within a certain window size (set by window_size above)\n svcluster = 2\n # Filter flag: MapQual, if the median mapq is lower than specified by this parameter\n mapq_flag = 80\n # Filter flag: PID, if the median percentage identity is lower than specified by this parameter\n pid_flag = 0.80\n # Filter flag: Gap, if the median GAP is higher than specified by this parameter\n gap_flag = 100\n # Filter flag: CIPOS|CIEND, if the CIPOS|CIEND is bigger than specified by this parameter\n ci_flag = 30\n\n [Phasing Options]\n ##Phasing is still experimental and needs to be properly tested and benchmarked, so use it at your own risk. \n #If True, NanoSV will use phasing as an addition in calling SVs\n phasing_on = False\n #SNP positions are stored in bins to improve speed. 
This setting sets the bin size\n variant_bin_size = 1000000\n #Window measured from the breakpoint in which SNPs are sought to be used in read clustering\n phasing_window = 7000\n #Minimum coverage to call a SNP for phasing\n min_coverage = 10\n #Maximum fraction of deletions at a position\n max_deletions = 0.25\n #Minimum occurrence of a variant to call a SNP for phasing\n min_occurences_of_var = 0.4\n #Minimum occurrence of high quality calls of a given variant\n min_highq_var = 0.6\n #Minimum quality to call 'high quality'\n min_base_qual_ph = 11\n #Cut-off setting to stop clustering if the highest similarity between reads is too low\n clustering_cutoff = 0.3\n\nAncillary files that can be used for running NanoSV:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTo estimate a coverage increase or decrease near predicted breakpoint-junctions, the average coverage across a putative deletion or duplication interval is compared to the distribution of coverage across random positions in the reference sequence. This calculation is only performed if ``depth_support = True`` in config.ini. A default bed file is provided that contains 1,000,000 random positions on the hg19/GRCh37 human genome reference, excluding simple repeat regions (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/simpleRepeat.txt.gz) and gap regions (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gap.txt.gz). 
The file format is standard BED format (chr\\=3", "summary": "Structural variation detection tool for Oxford Nanopore data.", "version": "1.2.4" }, "last_serial": 5922261, "releases": { "1.0.4": [ { "comment_text": "", "digests": { "md5": "3fbf04ff6b1c05340f17fd0dc4893781", "sha256": "d3d240b3cf848ef961020618a91fdbc6ae532c626e30f62a12bcd32b1523c838" }, "downloads": -1, "filename": "NanoSV-1.0.4.tar.gz", "has_sig": false, "md5_digest": "3fbf04ff6b1c05340f17fd0dc4893781", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816609, "upload_time": "2018-03-02T09:05:43", "url": "https://files.pythonhosted.org/packages/33/65/0f03e6cfaed7edfacfecd36cc1463e564b2b017293230da6424832df9d22/NanoSV-1.0.4.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "ddeebf989e8bf17bcee9ce523dda3710", "sha256": "c36a71113be56933287f34a057cea5effe59f0e389d4130853b00fd7dc249edd" }, "downloads": -1, "filename": "NanoSV-1.1.0.tar.gz", "has_sig": false, "md5_digest": "ddeebf989e8bf17bcee9ce523dda3710", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816586, "upload_time": "2018-03-19T07:14:09", "url": "https://files.pythonhosted.org/packages/d7/e1/69de97834f93237c3568e33813de995d98893f9a3ee774087dabe52f7422/NanoSV-1.1.0.tar.gz" } ], "1.1.1": [ { "comment_text": "", "digests": { "md5": "ad77c5d479ed7d6e15f64678224c17b4", "sha256": "a1396d1aae5e09d6954249fd99ae07fc0f4d1ce949cc339c2fdcbe8d5c91ce0e" }, "downloads": -1, "filename": "NanoSV-1.1.1.tar.gz", "has_sig": false, "md5_digest": "ad77c5d479ed7d6e15f64678224c17b4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816701, "upload_time": "2018-03-21T07:50:15", "url": "https://files.pythonhosted.org/packages/0f/f0/c00367216df57d7b73170aff1db4fdf5228057a6eb7dab6d123852d4cd13/NanoSV-1.1.1.tar.gz" } ], "1.1.2": [ { "comment_text": "", "digests": { "md5": "18811745724c6e742a94ffcf904a51af", "sha256": 
"ba51c4ce9604bd63d72fb237c8c662cd5b64e0d5ac8fbd06aa82d2e30fcc59f4" }, "downloads": -1, "filename": "NanoSV-1.1.2.tar.gz", "has_sig": false, "md5_digest": "18811745724c6e742a94ffcf904a51af", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816698, "upload_time": "2018-03-21T09:28:33", "url": "https://files.pythonhosted.org/packages/c4/47/57031f4b6915df4c228945001e16d867616f579dde345306d72ae4dcb51f/NanoSV-1.1.2.tar.gz" } ], "1.1.4": [ { "comment_text": "", "digests": { "md5": "7b4a4265094051fd1f677998586cfd1a", "sha256": "8d6a38a03be469069a893d85198f940812bce0f443616f7e0caeafdea3589524" }, "downloads": -1, "filename": "NanoSV-1.1.4.tar.gz", "has_sig": false, "md5_digest": "7b4a4265094051fd1f677998586cfd1a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816840, "upload_time": "2018-05-07T08:42:46", "url": "https://files.pythonhosted.org/packages/86/18/98ff72fe76ba7b252285b76231fddaaf6192b6608d946ca905f59c6d994e/NanoSV-1.1.4.tar.gz" } ], "1.1.5": [ { "comment_text": "", "digests": { "md5": "b0dbf8772300a9e538cc1b51ef4547f2", "sha256": "2b4aec43f0105e15b5d2656f8740dc1ac709f384cf67b5243c790aa5dbe34df3" }, "downloads": -1, "filename": "NanoSV-1.1.5.tar.gz", "has_sig": false, "md5_digest": "b0dbf8772300a9e538cc1b51ef4547f2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816840, "upload_time": "2018-05-07T09:35:09", "url": "https://files.pythonhosted.org/packages/8e/64/5738c9f33607fec0b48038868593930a4cccc9287954251ba4b9c70f07d2/NanoSV-1.1.5.tar.gz" } ], "1.1.6": [ { "comment_text": "", "digests": { "md5": "848308156986f3f682dece9ac0bcdd39", "sha256": "49eaddff0e44778fffed46db94fc3e66416b805dc940992ff40d9b59a347c5b9" }, "downloads": -1, "filename": "NanoSV-1.1.6.tar.gz", "has_sig": false, "md5_digest": "848308156986f3f682dece9ac0bcdd39", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5816860, "upload_time": 
"2018-05-09T11:37:27", "url": "https://files.pythonhosted.org/packages/72/fa/17c8c6b1a291728b5cda4bc8fe37be81cdd0dc626c3d1fa5e454f0cd18a9/NanoSV-1.1.6.tar.gz" } ], "1.2.0": [ { "comment_text": "", "digests": { "md5": "3e90b3da070920da47bae00965cecfe2", "sha256": "add65e53aacfc396835d752ba1c1431dfb20f058a9da2f1913aec40048199e80" }, "downloads": -1, "filename": "NanoSV-1.2.0.tar.gz", "has_sig": false, "md5_digest": "3e90b3da070920da47bae00965cecfe2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5821227, "upload_time": "2018-06-21T06:52:09", "url": "https://files.pythonhosted.org/packages/42/f8/48ea99c79d98b19b4d91fda2cfdfbd596510ae0a1205a706a31221dfab78/NanoSV-1.2.0.tar.gz" } ], "1.2.1": [ { "comment_text": "", "digests": { "md5": "3bdd79ea4f791c7f7604664dc63bb497", "sha256": "6dd61d7675ce394e21d62b3165e38ada3f7217d74b848d606dce6adc9ec8724f" }, "downloads": -1, "filename": "NanoSV-1.2.1.tar.gz", "has_sig": false, "md5_digest": "3bdd79ea4f791c7f7604664dc63bb497", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5821263, "upload_time": "2018-08-09T05:50:13", "url": "https://files.pythonhosted.org/packages/2d/ee/60d763de9e8b3e91336a601d2c328b786fa8fbc86980e4bd6c890d4c7a35/NanoSV-1.2.1.tar.gz" } ], "1.2.2": [ { "comment_text": "", "digests": { "md5": "e38e0735bec21492606e4d9c9c6a479a", "sha256": "d8d858b8cc7f23341f91abb2be1d3b48586c7b10ad9dbf68fb735bdda1e7ab69" }, "downloads": -1, "filename": "NanoSV-1.2.2.tar.gz", "has_sig": false, "md5_digest": "e38e0735bec21492606e4d9c9c6a479a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5821266, "upload_time": "2018-08-10T06:02:18", "url": "https://files.pythonhosted.org/packages/c5/44/ffed2b39530f05146781ea58e42d6a9ba86cf914c79684a4444f2131873b/NanoSV-1.2.2.tar.gz" } ], "1.2.3": [ { "comment_text": "", "digests": { "md5": "078c307281efaced15e5754df4ef22e4", "sha256": 
"d1f2d89cece444007bff0be88a22434cbdab75a3de560d188f004ad4b5ae3c8b" }, "downloads": -1, "filename": "NanoSV-1.2.3.tar.gz", "has_sig": false, "md5_digest": "078c307281efaced15e5754df4ef22e4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5821447, "upload_time": "2018-11-20T08:32:28", "url": "https://files.pythonhosted.org/packages/b6/cc/5cc6c58e2c11631ddc8ecabd7d2cdb5e12e08a40f967886531049da9a4e8/NanoSV-1.2.3.tar.gz" } ], "1.2.4": [ { "comment_text": "", "digests": { "md5": "d085631382e984580f503c3b67747dbc", "sha256": "af1b3671d862008b4631c6338fc7dbe3d6a11d4412c55eba4334f305a46a82f2" }, "downloads": -1, "filename": "NanoSV-1.2.4.tar.gz", "has_sig": false, "md5_digest": "d085631382e984580f503c3b67747dbc", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 5821426, "upload_time": "2019-10-03T08:05:36", "url": "https://files.pythonhosted.org/packages/d6/b7/550003d321b35a25147b6d8a90cf550af59171701804193f85e39ca3cc89/NanoSV-1.2.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d085631382e984580f503c3b67747dbc", "sha256": "af1b3671d862008b4631c6338fc7dbe3d6a11d4412c55eba4334f305a46a82f2" }, "downloads": -1, "filename": "NanoSV-1.2.4.tar.gz", "has_sig": false, "md5_digest": "d085631382e984580f503c3b67747dbc", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 5821426, "upload_time": "2019-10-03T08:05:36", "url": "https://files.pythonhosted.org/packages/d6/b7/550003d321b35a25147b6d8a90cf550af59171701804193f85e39ca3cc89/NanoSV-1.2.4.tar.gz" } ] }