{ "info": { "author": "Kristoffer Sahlin", "author_email": "kxs624@psu.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "isONclust\n===========\n\nisONclust is a tool for clustering either PacBio Iso-Seq reads, or Oxford Nanopore reads into clusters, where each cluster represents all reads that came from a gene. Output is a tsv file with each read assigned to a cluster-ID. Detailed information is available in [preprint](https://www.biorxiv.org/content/early/2018/11/06/463463). \n\n\nisONclust is distributed as a python package supported on Linux / OSX with python v>=3.4 as of version 0.0.2 and above (due to updates in python's multiprocessing library). [![Build Status](https://travis-ci.org/ksahlin/isONclust.svg?branch=master)](https://travis-ci.org/ksahlin/isONclust).\n\nTable of Contents\n=================\n\n * [INSTALLATION](#INSTALLATION)\n * [Using conda](#Using-conda)\n * [Using pip](#Using-pip)\n * [Downloading source from GitHub](#Downloading-source-from-github)\n * [Dependencies](#Dependencies)\n * [Testing installation](#testing-installation)\n * [USAGE](#USAGE)\n * [Iso-Seq](#Iso-Seq)\n * [Oxford Nanopore](#Oxford-Nanopore)\n * [Output](#Output)\n * [Parameters](#Parameters)\n * [CREDITS](#CREDITS)\n * [LICENCE](#LICENCE)\n\n\n\nINSTALLATION\n----------------\n\n### Using conda\nConda is the preferred way to install isONclust.\n\n1. Create and activate a new environment called isonclust\n\n```\nconda create -n isonclust python=3 pip \nsource activate isonclust\n```\n\n2. Install isONclust \n\n```\npip install isONclust\n```\n3. 
You should now have 'isONclust' installed; try it:\n```\nisONclust --help\n```\n\nEach time you start or log in to your server/computer, activate the conda environment \"isonclust\" before running isONclust:\n```\nsource activate isonclust\n```\n\n### Using pip \n\nTo install isONclust, run:\n```\npip install isONclust\n```\n`pip` will install the dependencies automatically for you. `pip` is Python's official package installer and is included in most Python versions. If you do not have `pip`, it can be easily installed [from here](https://pip.pypa.io/en/stable/installing/) and upgraded with `pip install --upgrade pip`. \n\n\n### Downloading source from GitHub\n\n#### Dependencies\n\nMake sure the dependencies listed below are installed (each name links to its installation instructions). Versions in parentheses are suggested, as isONclust has not been tested with earlier versions of these libraries; it may nevertheless work with earlier versions.\n* [parasail](https://github.com/jeffdaily/parasail-python)\n* [pysam](http://pysam.readthedocs.io/en/latest/installation.html) (>= v0.11)\n\n\nWith these dependencies installed, run:\n\n```sh\ngit clone https://github.com/ksahlin/isONclust.git\ncd isONclust\n./isONclust\n```\n\n### Testing installation\n\nYou can verify successful installation by running isONclust on this [small dataset](https://github.com/ksahlin/isONclust/tree/master/test/sample_alz_2k.fastq). Simply download the test dataset and run:\n\n```\nisONclust --fastq [test/sample_alz_2k.fastq] --outfolder [output path]\n```\n\n\nUSAGE\n-------\n\nisONclust can be used with either Iso-Seq or ONT reads. It takes either a fastq file or a ccs.bam file. \n\n\n\n### Iso-Seq\n\nisONclust works with full-length non-chimeric (_flnc_) reads that have quality values assigned to bases. The flnc reads with quality values can be generated as follows:\n\n1. 
Make sure quality values are output when running the circular consensus calling step (CCS), by running `ccs` with the parameter `--polish`.\n2. Run steps 2 and 3 of PacBio's Iso-Seq pipeline (primer removal and extraction of flnc reads) [isoseq3](https://github.com/PacificBiosciences/IsoSeq3/blob/master/README_v3.1.md). \n\nFlnc reads can be submitted as either a fastq file or a bam file. A fastq file is created from a BAM by running, _e.g._, `bamtools convert -format fastq -in flnc.bam -out flnc.fastq`. isONclust is called as follows:\n\n```\nisONclust pipeline --isoseq --fastq --outfolder \n```\n\nisONclust also supports older versions of the isoseq3 pipeline by taking the `ccs.bam` file together with the `flnc.bam`. In this case, isONclust can be run as follows:\n\n\n```\nisONclust --isoseq --ccs --flnc --outfolder \n```\nWhere `` is the file generated from `ccs` and `` is the file generated from `isoseq3 cluster`. The argument `--isoseq` simply means `--k 15 --w 50`. These arguments can be set manually without the `--isoseq` flag. Specify the number of cores with `--t`. \n\n\n### Oxford Nanopore\nisONclust needs a fastq file generated by an Oxford Nanopore basecaller.\n\n```\nisONclust pipeline --ont --fastq --outfolder \n```\nThe argument `--ont` simply means `--k 13 --w 20`. These arguments can be set manually without the `--ont` flag. Specify the number of cores with `--t`. \n\n#### Output\n\nThe output is a tsv file, `final_clusters.tsv`, written to the specified output folder. In this file, the first column is the cluster ID and the second column is the read accession. For example:\n```\n0 read_X_acc\n0 read_Y_acc\n...\nn read_Z_acc\n```\nIf there are n reads, there will be n rows. Some reads might be singletons. Rows are ordered by cluster size (largest first).\n\n\n\n\nCREDITS\n----------------\n\nPlease cite [1] when using isONclust.\n\n1. 
Kristoffer Sahlin, Paul Medvedev (2019) \"De Novo Clustering of Long-Read Transcriptome Data Using a Greedy, Quality-Value Based Algorithm\", RECOMB 2019 [Link](https://link.springer.com/chapter/10.1007/978-3-030-17083-7_14).\n\nBib record: \n\n@InProceedings{10.1007/978-3-030-17083-7_14,\nauthor=\"Sahlin, Kristoffer\nand Medvedev, Paul\",\neditor=\"Cowen, Lenore J.\",\ntitle=\"De Novo Clustering of Long-Read Transcriptome Data Using a Greedy, Quality-Value Based Algorithm\",\nbooktitle=\"Research in Computational Molecular Biology\",\nyear=\"2019\",\npublisher=\"Springer International Publishing\",\naddress=\"Cham\",\npages=\"227--242\",\nabstract=\"Long-read sequencing of transcripts with PacBio Iso-Seq and Oxford Nanopore Technologies has proven to be central to the study of complex isoform landscapes in many organisms. However, current de novo transcript reconstruction algorithms from long-read data are limited, leaving the potential of these technologies unfulfilled. A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin. To address this challenge, we develop isONclust, a clustering algorithm that is greedy (in order to scale) and makes use of quality values (in order to handle variable error rates). We test isONclust on three simulated and five biological datasets, across a breadth of organisms, technologies, and read depths. Our results demonstrate that isONclust is a substantial improvement over previous approaches, both in terms of overall accuracy and/or scalability to large datasets. 
Our tool is available at https://github.com/ksahlin/isONclust.\",\nisbn=\"978-3-030-17083-7\"\n}\n\nLICENCE\n----------------\n\nGPL v3.0, see [LICENSE.txt](https://github.com/ksahlin/isONclust/blob/master/LICENCE.txt).\n\n\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ksahlin/isONclust", "keywords": "Iso-Seq CCS PacBio Oxford Nanopore Technologies transcript long-read", "license": "", "maintainer": "", "maintainer_email": "", "name": "isONclust", "package_url": "https://pypi.org/project/isONclust/", "platform": "", "project_url": "https://pypi.org/project/isONclust/", "project_urls": { "Homepage": "https://github.com/ksahlin/isONclust" }, "release_url": "https://pypi.org/project/isONclust/0.0.5/", "requires_dist": [ "pysam (>=0.11)", "parasail (>=1.1.11)" ], "requires_python": "!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "summary": "De novo clustering of long-read transcriptome reads.", "version": "0.0.5" }, "last_serial": 5967337, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "a6662a910ea897f5540ba9d0a64fe59a", "sha256": "d4dd8a3dff2e885a96bee5a080504fd20d51f8be36ca7fb41d53157427b85e44" }, "downloads": -1, "filename": "isONclust-0.0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a6662a910ea897f5540ba9d0a64fe59a", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 552155, "upload_time": "2018-11-05T23:18:49", "url": "https://files.pythonhosted.org/packages/c5/9e/6ba6db056665f9485040f142c6320bcae49b7459d20a97990b4ca18a10b0/isONclust-0.0.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b108e5651e80d42301b7823b08c5dc9a", "sha256": "e28a02b6c2cc52e7ccb3d23563cac71b419a3238d2ad34385b2d121954dad647" }, "downloads": -1, "filename": "isONclust-0.0.1.tar.gz", "has_sig": false, "md5_digest": 
"b108e5651e80d42301b7823b08c5dc9a", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 546358, "upload_time": "2018-11-05T23:18:54", "url": "https://files.pythonhosted.org/packages/f3/1c/5da5b4b4fd5fb325716acf0ecff832061ed96da2da5673418d801520bbaa/isONclust-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "bc2b239c4bd27f2c4b546a42482e594b", "sha256": "30c4a9da4450bd4907d161369d7419b2838c1701e95b456b23f237936cf1bcc3" }, "downloads": -1, "filename": "isONclust-0.0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "bc2b239c4bd27f2c4b546a42482e594b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 560268, "upload_time": "2018-12-23T13:49:00", "url": "https://files.pythonhosted.org/packages/4a/93/55461310577113adf80be3522e5c06c9e61917aa79a7590b37d262c5377c/isONclust-0.0.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4cba156b020d2ebd8a36b28fea44cda1", "sha256": "14746a00d58156bc40019670b18725db239b522da640d2657ddc9646d53a5bbe" }, "downloads": -1, "filename": "isONclust-0.0.2.tar.gz", "has_sig": false, "md5_digest": "4cba156b020d2ebd8a36b28fea44cda1", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 552475, "upload_time": "2018-12-23T13:49:04", "url": "https://files.pythonhosted.org/packages/70/12/bad4b579efa6a0e24bb2a61d6bcd2a64a04977fb65f630166fdcd3516aaa/isONclust-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "73a593fe9ef490ca1d37cb593adbbc9d", "sha256": "aa26264771895714babb889bb17cbb05b3dd7156a671d94204673538aa4916b9" }, "downloads": -1, "filename": "isONclust-0.0.3.tar.gz", "has_sig": false, "md5_digest": "73a593fe9ef490ca1d37cb593adbbc9d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 553707, "upload_time": 
"2019-01-15T20:38:20", "url": "https://files.pythonhosted.org/packages/73/cf/0c64a6b7100bf48dacb301125ecdca2ea28e4d05a6f4fc65205bea658a49/isONclust-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "faea854dc87aad87b22c63249ec1ad6d", "sha256": "ad48b951c436de653ae0e692c102fbe669124124cfbb2e8f2cf5f04cf6cdf5ac" }, "downloads": -1, "filename": "isONclust-0.0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "faea854dc87aad87b22c63249ec1ad6d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 560715, "upload_time": "2019-02-28T14:53:21", "url": "https://files.pythonhosted.org/packages/c8/c1/49a87869fd80f7d454c93adbd5358a2d4e321102ef5aec00872291cf2190/isONclust-0.0.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9d6643271eeaec6af0d95557d95b5c4e", "sha256": "b0e034786349764f263ffca26f69d9c4c7eb7f3051d706098d81a5877896b5db" }, "downloads": -1, "filename": "isONclust-0.0.4.tar.gz", "has_sig": false, "md5_digest": "9d6643271eeaec6af0d95557d95b5c4e", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 545380, "upload_time": "2019-02-28T14:53:27", "url": "https://files.pythonhosted.org/packages/b9/7e/f7ffe4c52e0c6eb6a1d2c781286baba0dcb19d186b3d4b6e92ffe9ffdf59/isONclust-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "142c7c6d6c06e653f74c352782529c95", "sha256": "5fd8e6f2d09a7262911904f156191e9867069127aea7ee317f9b378263cdbcbf" }, "downloads": -1, "filename": "isONclust-0.0.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "142c7c6d6c06e653f74c352782529c95", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": "!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 573626, "upload_time": "2019-10-13T13:02:43", "url": 
"https://files.pythonhosted.org/packages/71/1c/e6350e19e87146037855b5a6f558f88cf519f3ddcce1396739ae73d80aae/isONclust-0.0.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "58ab09acec7d5e42405a2b53038d8739", "sha256": "f5da8443794f9c6f59f7421542f99c380ebcd5038a21c75296a598aee07bbf5d" }, "downloads": -1, "filename": "isONclust-0.0.5.tar.gz", "has_sig": false, "md5_digest": "58ab09acec7d5e42405a2b53038d8739", "packagetype": "sdist", "python_version": "source", "requires_python": "!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 542602, "upload_time": "2019-10-13T13:02:46", "url": "https://files.pythonhosted.org/packages/89/6c/332f0027e11b8747004e3c255e3c7ac6ac4473f4d5538697d301b74097c7/isONclust-0.0.5.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "142c7c6d6c06e653f74c352782529c95", "sha256": "5fd8e6f2d09a7262911904f156191e9867069127aea7ee317f9b378263cdbcbf" }, "downloads": -1, "filename": "isONclust-0.0.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "142c7c6d6c06e653f74c352782529c95", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": "!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 573626, "upload_time": "2019-10-13T13:02:43", "url": "https://files.pythonhosted.org/packages/71/1c/e6350e19e87146037855b5a6f558f88cf519f3ddcce1396739ae73d80aae/isONclust-0.0.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "58ab09acec7d5e42405a2b53038d8739", "sha256": "f5da8443794f9c6f59f7421542f99c380ebcd5038a21c75296a598aee07bbf5d" }, "downloads": -1, "filename": "isONclust-0.0.5.tar.gz", "has_sig": false, "md5_digest": "58ab09acec7d5e42405a2b53038d8739", "packagetype": "sdist", "python_version": "source", "requires_python": "!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4", "size": 542602, "upload_time": "2019-10-13T13:02:46", "url": "https://files.pythonhosted.org/packages/89/6c/332f0027e11b8747004e3c255e3c7ac6ac4473f4d5538697d301b74097c7/isONclust-0.0.5.tar.gz" } ] }