{
    "info": {
        "author": "",
        "author_email": "",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 4 - Beta",
            "Intended Audience :: Developers",
            "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
            "Programming Language :: Python :: 2",
            "Programming Language :: Python :: 2.7",
            "Programming Language :: Python :: 3",
            "Programming Language :: Python :: 3.3",
            "Programming Language :: Python :: 3.4",
            "Programming Language :: Python :: 3.5",
            "Programming Language :: Python :: 3.6",
            "Topic :: Software Development :: Build Tools"
        ],
        "description": "NanoSim-H\n=========\n\n.. image:: https://travis-ci.org/karel-brinda/NanoSim-H.svg?branch=master\n\t:target: https://travis-ci.org/karel-brinda/NanoSim-H\n\n.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square\n\t:target: https://anaconda.org/bioconda/nanosim-h\n\n.. image:: https://badge.fury.io/py/NanoSim-H.svg\n\t:target: https://badge.fury.io/py/NanoSim-H\n\n\nAbout\n-----\n\nNanoSim-H is a simulator of Oxford Nanopore reads that captures the technology-specific features of ONT data,\nand allows for adjustments upon improvement of Nanopore sequencing technology.\nNanoSim-H has been derived from `NanoSim <https://github.com/bcgsc/NanoSim>`_,\na software package developed by Chen Yang at `Canada's Michael Smith Genome Sciences Centre <http://www.bcgsc.ca/>`_.\nThe fork was created from version 1.0.1 and the versions of NanoSim-H and NanoSim are kept synchronized.\n\nNanoSim-H is implemented in Python and uses R for model fitting.\nIn silico reads can be simulated from a given reference genome using ``nanosim-h``.\nThe NanoSim-H package is distributed with several precomputed error profiles, but\nadditional profiles can be computed using ``nanosim-h-train``.\n\nThe main improvements compared to NanoSim are:\n\n* Support for Python 3\n* Support for `RNF <https://www.ncbi.nlm.nih.gov/pubmed/26353839>`_ read names\n* Installation from `PyPI <https://pypi.python.org/pypi/NanoSim-H/>`_\n* Error profiles distributed with the main package\n* Automatic testing using `Travis <https://travis-ci.org/karel-brinda/NanoSim-H>`_\n* Reproducible simulations (setting a seed for the PRNG)\n* Improved interface with new parameters (e.g., for merging all contigs) and a progress bar\n* Several minor bugs fixed\n\n\n\nQuick example\n-------------\n\nSimulation of 100 reads from an *E. coli* genome.\n\n.. 
code-block:: bash\n\n\tpip install --upgrade nanosim-h\n\tcurl \"https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?db=nuccore&dopt=fasta&val=545778205&sendto=on\" | \\\n\t\tnanosim-h -n 100 -\n\n\n\nInstallation\n------------\n\n**From** `BioConda <https://bioconda.github.io/>`_ **(recommended):**\n\n\n.. code-block:: bash\n\n\tconda config --add channels defaults\n\tconda config --add channels conda-forge\n\tconda config --add channels bioconda\n\tconda install -y nanosim-h\n\n**From** `PyPI <https://pypi.python.org/pypi/NanoSim-H/>`_ **:**\n\n.. code-block:: bash\n\n\tpip install --upgrade nanosim-h\n\n**From Github:**\n\n.. code-block:: bash\n\n\tgit clone https://github.com/karel-brinda/nanosim-h\n\tcd nanosim-h\n\tpip install --upgrade .\n\nor\n\n.. code-block:: bash\n\n\tgit clone https://github.com/karel-brinda/nanosim-h\n\tcd nanosim-h\n\tpython setup.py install\n\n\n**Dependencies:**\n\nFor read simulation:\n\n* `Python <http://python.org>`_ (2.7, 3.2 - 3.6)\n* `Numpy <http://www.numpy.org/>`_\n\nFor computing new error profiles:\n\n* `LAST <http://last.cbrc.jp/>`_ (tested with version 847)\n* `R <https://www.r-project.org/>`_\n\nWhen installed using Bioconda, all NanoSim-H dependencies get installed automatically.\nWhen installed using PIP, all dependencies for read simulation are installed automatically.\n\n\nRead simulation\n---------------\n\nSimulation stage takes a reference genome and possibly a read profile as input, and outputs simulated reads in FASTA format.\n\n\n.. command: nanosim-h --help\n\n.. 
code-block::\n\n\t$ nanosim-h --help\n\tusage: nanosim-h [-h] [-v] [-p str] [-o str] [-n int] [-u float] [-m float]\n\t                 [-i float] [-d float] [-s int] [--circular] [--perfect]\n\t                 [--merge-contigs] [--rnf] [--rnf-add-cigar] [--max-len int]\n\t                 [--min-len int] [--kmer-bias int]\n\t                 <reference.fa>\n\t\n\tProgram:  NanoSim-H - a simulator of Oxford Nanopore reads.\n\tVersion:  1.1.0.4\n\tAuthors:  Chen Yang <cheny@bcgsc.ca> - author of the original software package (NanoSim)\n\t          Karel Brinda <kbrinda@hsph.harvard.edu> - author of the NanoSim-H fork\n\t\n\tpositional arguments:\n\t  <reference.fa>        reference genome (- for standard input)\n\t\n\toptional arguments:\n\t  -h, --help            show this help message and exit\n\t  -v, --version         show program's version number and exit\n\t  -p str, --profile str\n\t                        error profile - one of precomputed profiles\n\t                        ('ecoli_R7.3', 'ecoli_R7', 'ecoli_R9_1D',\n\t                        'ecoli_R9_2D', 'yeast', 'ecoli_UCSC1b') or own\n\t                        directory with an error profile [ecoli_R9_2D]\n\t  -o str, --out-pref str\n\t                        prefix of output file [simulated]\n\t  -n int, --number int  number of generated reads [10000]\n\t  -u float, --unalign-rate float\n\t                        rate of unaligned reads [detect from the error\n\t                        profile]\n\t  -m float, --mis-rate float\n\t                        mismatch rate (weight tuning) [1.0]\n\t  -i float, --ins-rate float\n\t                        insertion rate (weight tuning) [1.0]\n\t  -d float, --del-rate float\n\t                        deletion rate (weight tuning) [1.0]\n\t  -s int, --seed int    initial seed for the pseudorandom number generator (0\n\t                        for random) [42]\n\t  --circular            circular simulation (linear otherwise)\n\t  --perfect             output 
perfect reads, no mutations\n\t  --merge-contigs       merge contigs from the reference\n\t  --rnf                 use RNF format for read names\n\t  --rnf-add-cigar       add cigar to RNF names (not fully debugged, yet)\n\t  --max-len int         maximum read length [inf]\n\t  --min-len int         minimum read length [50]\n\t  --kmer-bias int       prohibits homopolymers with length >= n bases in\n\t                        output reads [6]\n\t\n\tExamples: nanosim-h --circular ecoli_ref.fasta\n\t          nanosim-h --circular --perfect ecoli_ref.fasta\n\t          nanosim-h -p yeast --kmer-bias 0 yeast_ref.fasta\n\t\n\tNotice: the use of `max-len` and `min-len` will affect the read length distributions. If\n\tthe range between `max-len` and `min-len` is too small, the program will run slowlier accordingly.\n\t\n\n.. end\n\n\n**Examples:**\n\n1. If you want to simulate reads from an *E. coli* genome, then circular mode should be used because it is a circular genome.\n\n\t``nanosim-h --circular Ecoli_ref.fasta``\n\n2. If you want to simulate only perfect reads (i.e., reads with no SNPs or indels, so that only the read length distribution is simulated), add the ``--perfect`` option.\n\n\t``nanosim-h --circular --perfect Ecoli_ref.fasta``\n\n3. If you want to simulate reads from a *S. cerevisiae* genome with no *k*-mer bias, then linear mode should be chosen because it is a linear genome.\n\n\t``nanosim-h -p yeast --kmer-bias 0 yeast_ref.fasta``\n\n\n**Output files:**\n\n1. ``simulated.log`` \u2013 Log file of the simulation process.\n\n2. ``simulated.fa`` \u2013 FASTA file of simulated reads. 
Read names can record how each read was created, either in the RNF naming convention or in the original NanoSim naming convention.\n\n        **RNF naming convention**\n\n        See the associated `RNF paper <https://www.ncbi.nlm.nih.gov/pubmed/26353839/>`_ and `RNF specification <http://karel-brinda.github.io/rnf-spec/>`_.\n\n        **NanoSim naming convention**\n\n\tEach read has \"unaligned\", \"aligned\", or \"perfect\" in the header, which determines its error rate. \"unaligned\" means that the read has an error rate over 90% and cannot be aligned. \"aligned\" reads have the same error rate as the training reads. \"perfect\" reads have no errors.\n\n\tThe information in the header is explained in the following two examples:\n\n\t* ``>ref|NC-001137|-[chromosome=V]_468529_unaligned_0_F_0_3236_0``\n\t\tEverything before the first ``_`` is chromosome information. ``468529`` is the start position, and *unaligned* indicates that the read should not align to the reference. The first ``0`` is the sequence index. ``F`` represents the forward strand. ``0_3236_0`` means that the sequence extracted from the reference is 3236 bases long.\n\t* ``>ref|NC-001143|-[chromosome=XI]_115406_aligned_16565_R_92_12710_2``\n\t\tThis is an aligned read coming from chromosome XI at position 115406. ``16565`` is the sequence index. ``R`` represents the reverse complement strand. ``92_12710_2`` means that this read has a 92-base head region (which cannot be aligned), followed by a 12710-base middle region, and then a 2-base tail region.\n\n\tThe information in the header helps users locate the origin of each read.\n\n3. ``simulated.errors.txt`` \u2013 List of introduced errors.\n\n\tThe output contains the error type, position, original bases, and current bases.\n\n\nError profiles\n--------------\n\nThe characterization stage takes a reference and a training read set in FASTA format as input. 
Users can also provide their own alignment file in MAF format.\n\n\n**Profiles distributed with NanoSim-H:**\n\n* ``ecoli_R7``\n* ``ecoli_R7.3``\n* ``ecoli_R9_1D``\n* ``ecoli_R9_2D`` (default error profile for read simulation)\n* ``ecoli_UCSC1b``\n* ``yeast``\n\n**New error profiles:**\n\nA new error profile can be obtained using the ``nanosim-h-train`` command.\n\n.. command: nanosim-h-train --help\n\n.. code-block::\n\n\t$ nanosim-h-train --help\n\tusage: nanosim-h-train [-h] [-v] [-i str] [-m str] [-b int] [--no-model-fit]\n\t                       <reference.fa> <profile.dir>\n\t\n\tProgram:  NanoSim-H-Train - compute an error profile for NanoSim-H.\n\tVersion:  1.1.0.4\n\tAuthors:  Chen Yang <cheny@bcgsc.ca> - author of the original software package (NanoSim)\n\t          Karel Brinda <kbrinda@hsph.harvard.edu> - author of the NanoSim-H fork\n\t\n\tpositional arguments:\n\t  <reference.fa>        reference genome of the training reads\n\t  <profile.dir>         error profile dir\n\t\n\toptional arguments:\n\t  -h, --help            show this help message and exit\n\t  -v, --version         show program's version number and exit\n\t  -i str, --infile str  training ONT real reads, must be fasta files\n\t  -m str, --maf str     user can provide their own alignment file, with maf\n\t                        extension\n\t  -b int, --num-bins int\n\t                        number of bins (for development) [20]\n\t  --no-model-fit        no model fitting\n\t\n\n.. end\n\n**Files associated with an error profile:**\n\n1. ``aligned_length_ecdf`` \u2013 Length distribution of aligned regions on aligned reads.\n2. ``aligned_reads_ecdf`` \u2013 Length distribution of aligned reads.\n3. ``align_ratio`` \u2013 Empirical distribution of align ratio of each read.\n4. ``besthit.maf`` \u2013 The best alignment of each read based on length.\n5. ``match.hist``, ``mis.hist``, ``ins.hist``, ``del.hist`` \u2013 Histograms of matches, mismatches, insertions, and deletions.\n6. 
``first_match.hist`` \u2013 Histogram of the first match length of each alignment.\n7. ``error_markov_model`` \u2013 Markov model of error types.\n8. ``ht_ratio`` \u2013 Empirical distribution of the head region vs total unaligned region.\n9. ``training.maf`` \u2013 The output of LAST, alignment file in MAF format.\n10. ``match_markov_model`` \u2013 Markov model of the length of matches (stretches of correct base calls).\n11. ``model_profile`` \u2013 Fitted model for errors.\n12. ``processed.maf`` \u2013 A re-formatted MAF file for user-provided alignment file.\n13. ``unaligned_length_ecdf`` \u2013 Length distribution of unaligned reads",
        "description_content_type": "",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/karel-brinda/nanosim-h",
        "keywords": "Nanopore simulation",
        "license": "GPLv3",
        "maintainer": "",
        "maintainer_email": "",
        "name": "NanoSim-H",
        "package_url": "https://pypi.org/project/NanoSim-H/",
        "platform": "",
        "project_url": "https://pypi.org/project/NanoSim-H/",
        "project_urls": {
            "Homepage": "https://github.com/karel-brinda/nanosim-h"
        },
        "release_url": "https://pypi.org/project/NanoSim-H/1.1.0.4/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "",
        "version": "1.1.0.4"
    },
    "last_serial": 4145302,
    "releases": {
        "1.1.0.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "9f74b3518acf8679eed8e4fb96e76553",
                    "sha256": "8565bf378d0fe6e38399d8baa624ccd8f25a6454aa57aee78fbac4ae932254b9"
                },
                "downloads": -1,
                "filename": "NanoSim-H-1.1.0.0.tar.gz",
                "has_sig": false,
                "md5_digest": "9f74b3518acf8679eed8e4fb96e76553",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 688305,
                "upload_time": "2017-05-10T18:53:54",
                "url": "https://files.pythonhosted.org/packages/e0/af/3afd825eedf07b4370ece52a0ba81f55c8cecdf70ba4951c486c2427bc51/NanoSim-H-1.1.0.0.tar.gz"
            }
        ],
        "1.1.0.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "2f4f94dd5e1458c825442c971cc14df6",
                    "sha256": "91bf1b5203225af1ae83d07a91353a748948e6c6caf99fdb001e58c08f6c7986"
                },
                "downloads": -1,
                "filename": "NanoSim-H-1.1.0.1.tar.gz",
                "has_sig": false,
                "md5_digest": "2f4f94dd5e1458c825442c971cc14df6",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 688324,
                "upload_time": "2017-05-10T19:00:24",
                "url": "https://files.pythonhosted.org/packages/2c/13/a33d0e4c29b70ff1189bc6a56418c0f3cd151c49d6aaf551f7ca1ae1995c/NanoSim-H-1.1.0.1.tar.gz"
            }
        ],
        "1.1.0.2": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "6c4da7a548a047b9c7b409ed60afd6b2",
                    "sha256": "c714fb90f0c8e2672540e789ab0a9e797e2385372f45ef85cd8b82aee5fc1a29"
                },
                "downloads": -1,
                "filename": "NanoSim-H-1.1.0.2.tar.gz",
                "has_sig": false,
                "md5_digest": "6c4da7a548a047b9c7b409ed60afd6b2",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 688326,
                "upload_time": "2017-05-10T19:02:29",
                "url": "https://files.pythonhosted.org/packages/60/ed/b0e9384cc59ba52e22b62b7517b356bafc18fafa64ae76ca5194745e92b2/NanoSim-H-1.1.0.2.tar.gz"
            }
        ],
        "1.1.0.3": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "f1ddee7cbe95bf1efedc437d5c54ad80",
                    "sha256": "de82ad6ee2b2fabd2cb513bfb353be6ec4e7d9d81681d78cca6ec1af4812a27a"
                },
                "downloads": -1,
                "filename": "NanoSim-H-1.1.0.3.tar.gz",
                "has_sig": false,
                "md5_digest": "f1ddee7cbe95bf1efedc437d5c54ad80",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 688265,
                "upload_time": "2017-05-16T19:09:42",
                "url": "https://files.pythonhosted.org/packages/20/32/4e5abbfb64835c62be3f4d3a5178e33a2f6aabea9cceb3e15927c90b0f39/NanoSim-H-1.1.0.3.tar.gz"
            }
        ],
        "1.1.0.4": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "68d2c6724dd0f170964383684c5e8b8c",
                    "sha256": "cd911f9b05419e164a92e5d36d8ae1e6d60641b54a8ca4bf5a179be4e89a52c6"
                },
                "downloads": -1,
                "filename": "NanoSim_H-1.1.0.4-py2.py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "68d2c6724dd0f170964383684c5e8b8c",
                "packagetype": "bdist_wheel",
                "python_version": "3.6",
                "requires_python": null,
                "size": 695168,
                "upload_time": "2018-08-07T17:55:30",
                "url": "https://files.pythonhosted.org/packages/e2/f3/a339c42515d5beec10c22e8a6c821668776714b4cdda17f0bc065c538409/NanoSim_H-1.1.0.4-py2.py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "e6bd54bd59ac87d8d9416bfe6edec8a7",
                    "sha256": "0fc5f31f3569a04d77bcf8322390301f01d4c26da67d05cf79a57b44d00e341c"
                },
                "downloads": -1,
                "filename": "NanoSim-H-1.1.0.4.tar.gz",
                "has_sig": false,
                "md5_digest": "e6bd54bd59ac87d8d9416bfe6edec8a7",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 697244,
                "upload_time": "2018-08-07T17:55:28",
                "url": "https://files.pythonhosted.org/packages/f5/44/8815be8aec318b1d77b4ce10d523081cb6975cc3c76382c8ab971d0a96eb/NanoSim-H-1.1.0.4.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "68d2c6724dd0f170964383684c5e8b8c",
                "sha256": "cd911f9b05419e164a92e5d36d8ae1e6d60641b54a8ca4bf5a179be4e89a52c6"
            },
            "downloads": -1,
            "filename": "NanoSim_H-1.1.0.4-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "68d2c6724dd0f170964383684c5e8b8c",
            "packagetype": "bdist_wheel",
            "python_version": "3.6",
            "requires_python": null,
            "size": 695168,
            "upload_time": "2018-08-07T17:55:30",
            "url": "https://files.pythonhosted.org/packages/e2/f3/a339c42515d5beec10c22e8a6c821668776714b4cdda17f0bc065c538409/NanoSim_H-1.1.0.4-py2.py3-none-any.whl"
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "e6bd54bd59ac87d8d9416bfe6edec8a7",
                "sha256": "0fc5f31f3569a04d77bcf8322390301f01d4c26da67d05cf79a57b44d00e341c"
            },
            "downloads": -1,
            "filename": "NanoSim-H-1.1.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "e6bd54bd59ac87d8d9416bfe6edec8a7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 697244,
            "upload_time": "2018-08-07T17:55:28",
            "url": "https://files.pythonhosted.org/packages/f5/44/8815be8aec318b1d77b4ce10d523081cb6975cc3c76382c8ab971d0a96eb/NanoSim-H-1.1.0.4.tar.gz"
        }
    ]
}