{ "info": { "author": "Michael Benjamin Hall", "author_email": "mbhall88@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 3.6" ], "description": "# Pistis\n\n\n### Quality control plotting for long reads.\n\n[![PyPI status](https://img.shields.io/pypi/v/pistis.svg)](https://pypi.python.org/pypi/pistis)\n[![Build Status](https://travis-ci.org/mbhall88/pistis.svg?branch=master)](https://travis-ci.org/mbhall88/pistis)\n[![GitHub license](https://img.shields.io/github/license/mbhall88/pistis.svg)](https://github.com/mbhall88/pistis/blob/master/LICENSE)\n[![Twitter Follow](https://img.shields.io/twitter/follow/mbhall88.svg?style=social&logo=twitter&label=Follow)](https://twitter.com/mbhall88) \n\nThis package provides plotting designed to give you an idea of how your long read\nsequencing data looks. It was conceived of and developed with nanopore reads in\nmind, but there is not reason why PacBio reads can't be used. \n\n\n## Installation\n\nInstallation is fairly straightforward\n\n```sh\npip3 install pistis\n```\n\nYou can also use `pip` if you are running with python2. \nOr using a virtual\nenvironment manager such as [conda](https://conda.io/docs/) or\n[pipenv](https://docs.pipenv.org/) (my personal favourite). \n\nYou should now be able to run `pistis` from the command line\n```sh\npistis --help\n```\n\n## Usage\n\nThe main use case for `pistis` is as a command-line interface (CLI), but can also be\nused in an interactive way, such as with a [Jupyter Notebook](https://jupyter.org/). \n\n#### CLI Usage\nAfter installing and running the help menu you should see the following usage\noptions\n```\npistis -h\n\nUsage: pistis [OPTIONS]\n\n A package for sanity checking (quality control) your long read data.\n Feed it a fastq file and in return you will receive a PDF with four plots:\n\n 1. GC content histogram with distribution curve for sample.\n\n 2. Jointplot showing the read length vs. phred quality score for\n each read. The interior representation of this plot can be\n altered with the --kind option.\n\n 3. Box plot of the phred quality score at positional bins across\n all reads. The reads are binned into read positions 1, 2, 3, 4, 5,\n 6, 7, 8, 9, 10, 11-20, 21-50, 51-100, 101-200, 201-300. Plots from\n the start of reads.\n\n 4. Same as 3, but plots from the end of the read.\n\n Additionally, if you provide a BAM/SAM file a histogram of the read\n percent identity will be added to the report.\n\nOptions:\n -f, --fastq PATH Fastq file to plot. This can be gzipped.\n -o, --output PATH Path to save the plot PDF as. If name is not\n specified, will use the name of the fastq\n (or bam) file with .pdf extension.\n -k, --kind [kde|scatter|hex] The kind of representation to use for the\n jointplot of quality score vs read length.\n Accepted kinds are 'scatter', 'kde'\n (default), or 'hex'. For examples refer to h\n ttps://seaborn.pydata.org/generated/seaborn.\n jointplot.html\n --log_length / --no_log_length Plot the read length as a log10\n transformation on the quality vs read length\n plot\n -b, --bam PATH SAM/BAM file to produce read percent\n identity histogram from.\n -d, --downsample INTEGER Down-sample the sequence files to a given\n number of reads. Set to 0 for no\n subsampling. Default: 50000\n -h, --help Show this message and exit.\n```\n\nNote the `--downsample` option is set to 50000 by default. That is, `pistis` will \nonly plot 50000 reads (sampled from a uniform distribution). You can set this to \n0 if you want to plot every read, or select another number of your choosing. Be aware \nthat if you try to plot too many reads you may run into memory issues, so try \ndownsampling if this happens. \n\nThere are three different use cases - currently - for producing plots: \n\n**Fastq only** - This will return four plots:\n * A distribution plot of the GC content for each read.\n * A bivariate jointplot with read length on the y-axis and mean read quality\n score on the x-axis.\n * Two boxplots that show the distribution of quality scores at select positions\n and positional ranges. One plot shows the scores from the beginning of the\n read and the other from the end of the read. \n\nTo use `pistis` in this way you just need a fastq file.\n\n```sh\npistis -f /path/to/my.fastq -o /save/as/report.pdf\n```\n\nThis will save the four plots to a file called `report.pdf` in directory `/save/as/`.\nIf you don't provide a `--output/-o` option the file will be saved in the current\ndirectory with the basename of the fastq file. So in the above example it would be\nsaved as `my.pdf`. \nIf you would prefer the read lengths in the bivariate plot of read length vs.\nmean quality score then you can indicate this like so\n\n```sh\npistis -f /path/to/my.fastq -o /save/as/report.pdf --no_log_length\n```\n\nAdditionally, you can change the way the data is represented in the bivariate plot.\nThe default is a kernel density estimation plot (as in the below image), however you can\nchoose to use a [hex bin or scatter plot version instead](https://seaborn.pydata.org/generated/seaborn.jointplot.html).\nIn the running example, to use a scatter plot you would run the following\n\n```sh\npistis -f /path/to/my.fastq -o /save/as/report.pdf --kind scatter\n```\n\nYou can also provide a `gzip`ed fastq file without any extra steps\n\n```sh\npistis -f /path/to/my.fastq.gz -o /save/as/report.pdf\n```\n\n**Examples** \nGC content: \n![gc content plot](https://github.com/mbhall88/pistis/blob/master/docs/imgs/pistis_gc_plot.png)\n\nRead length vs. mean read quality score: \n![read length vs quality plot](https://github.com/mbhall88/pistis/blob/master/docs/imgs/pistis_qual_v_len.png) \n\nBase quality from the start of each read: \n![base quality from start plot](https://github.com/mbhall88/pistis/blob/master/docs/imgs/pistis_qual_start.png) \n\nBase quality from the end of each read: \n![base quality from end plot](https://github.com/mbhall88/pistis/blob/master/docs/imgs/pistis_qual_end.png)\n\n---\n\n**Fastq and BAM/SAM** - This will return the above four plots, plus a distribution\nplot of each read's percent identity with the reference it is aligned to in the\n[BS]AM file. Reads which are flagged as supplementary or secondary are not included.\nThe plot also includes a dashed vertical red line indicating the median\npercent identity. \nNote: If using a BAM file, it must be sorted and indexed (i.e `.bai` file). See [`samtools`](http://www.htslib.org/doc/samtools.html)\nfor instructions on how to do this.\n\n```sh\npistis -f /path/to/my.fastq -b /path/to/my.bam -o /save/as/report.pdf\n# or\npistis -f /path/to/my.fastq -b /path/to/my.sam -o /save/as/report.pdf\n```\n\n**Example** \nDistribution of aligned read percent identity: \n![percent identity plot](https://github.com/mbhall88/pistis/blob/master/docs/imgs/pistis_perc_id.png)\n\n---\n\n**BAM/SAM only** - At this stage you will receive only the distribution\nplot of each read's percent identity with the reference it is aligned to. In a\nfuture release I aim to allow you to also get the other four fastq-only plots.\n\n```sh\npistis -b /path/to/my.bam -o /save/as/report.pdf\n```\n\nAs with the fastq-only method, if you don't provide a `--output/-o` option the file will be saved in the current\ndirectory with the basename of the [BS]AM file. So in the above example it would be\nsaved as `my.pdf`.\n\n#### Usage in a development environment\n\nIf you would like to use `pistis` within a development environment such as a\n`jupyter notebook` or just a plain ol' python shell then take a look at [this example notebook](https://github.com/mbhall88/pistis/blob/master/examples/example_usage.ipynb)\nfor all the details.\n\n## Credits\n\n* This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [`audreyr/cookiecutter-pypackage` project template](https://github.com/audreyr/cookiecutter-pypackage). \n* The two test data files (fastq and BAM) that I have use in this repository were\ntaken from [Wouter De Coster's `nanotest` repository](https://github.com/wdecoster/nanotest).\n* Which in turn comes from [Nick Loman and Josh Quick](http://lab.loman.net/2017/03/09/ultrareads-for-nanopore/). \n* The example plots in this `README` were made using the entire fastq of basecalled\nreads from the experiment in that [blog on \"whale hunting\"](http://lab.loman.net/2017/03/09/ultrareads-for-nanopore/). \n* The plot for the BAM file was obtained by running `pistis` on a BAM file generated\nby mapping the fastq file to *E. coli* reference [NC_000913.3](https://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3)\nusing Heng Li's [`minimap2`](https://github.com/lh3/minimap2) and `-x map-ont` option.\n\n# Contributing\n\nIf you would like to contribute to this package you are more than welcome. \n**Please read through the [contributing guidelines](https://github.com/mbhall88/pistis/blob/master/CONTRIBUTING.rst) first**.\n\n\n=======\nHistory\n=======\n\n0.1.0 (2018-02-22)\n------------------\n\n* First release on PyPI.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/mbhall88/pistis", "keywords": "pistis", "license": "MIT license", "maintainer": "", "maintainer_email": "", "name": "pistis", "package_url": "https://pypi.org/project/pistis/", "platform": "", "project_url": "https://pypi.org/project/pistis/", "project_urls": { "Homepage": "https://github.com/mbhall88/pistis" }, "release_url": "https://pypi.org/project/pistis/0.3.3/", "requires_dist": null, "requires_python": "", "summary": "Quality control plotting for long reads", "version": "0.3.3" }, "last_serial": 3945521, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "542a8ede65691a49f04bb7a3cf63936e", "sha256": "78e9dfd0498112637e458cfdc376ad5747f63f3fffdfc004d3404a1fd82170f6" }, "downloads": -1, "filename": "pistis-0.1.0.tar.gz", "has_sig": false, "md5_digest": "542a8ede65691a49f04bb7a3cf63936e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24946652, "upload_time": "2018-03-01T10:19:40", "url": "https://files.pythonhosted.org/packages/e1/96/3a8242aa42210846e7f0a6294547f17f0457c1ca93df4313e75905dba229/pistis-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "12f5c28f45299501c550e427fe090639", "sha256": "e0788a60b355eadc5fc6e0ecadbd723f864fd93a95b0c481ff531299ec13fd60" }, "downloads": -1, "filename": "pistis-0.1.1.tar.gz", "has_sig": false, "md5_digest": "12f5c28f45299501c550e427fe090639", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25736490, "upload_time": "2018-03-01T14:40:07", "url": "https://files.pythonhosted.org/packages/41/4a/2b8919a4433590b3d90078846f76ff534148ebd129549230dcb9cc6df039/pistis-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "6f5a096f3f231135a06d6c8ce75de932", "sha256": "5f16371aea9404e956cf51f6f37ce905c13b51352c50e9f681245c89349d7ad0" }, "downloads": -1, "filename": "pistis-0.1.2.tar.gz", "has_sig": false, "md5_digest": "6f5a096f3f231135a06d6c8ce75de932", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25739444, "upload_time": "2018-03-02T14:05:01", "url": "https://files.pythonhosted.org/packages/0b/73/f19a3e9ce9e173f872ca454fa4de431de27c435dc747a3227bf6c9f47547/pistis-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "e5763c56c9ab96e45da2db3c8e01344d", "sha256": "fb4b531f9d2a830d027bd100ea02792261e2c7616a9d3cb99e1f9ab184665f85" }, "downloads": -1, "filename": "pistis-0.1.3.tar.gz", "has_sig": false, "md5_digest": "e5763c56c9ab96e45da2db3c8e01344d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25645460, "upload_time": "2018-03-06T11:13:18", "url": "https://files.pythonhosted.org/packages/68/f0/967c6a75bb9349aee8636c7d2fb84da1c1be1965474d69d264a18d959c3c/pistis-0.1.3.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "a75585907686d9b7f8bebe6f7cdf2332", "sha256": "e4262351c8eaaee3d63086fe4756abcd5d2fc97cab4f558cd3c789bc3d07585a" }, "downloads": -1, "filename": "pistis-0.2.0.tar.gz", "has_sig": false, "md5_digest": "a75585907686d9b7f8bebe6f7cdf2332", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25726286, "upload_time": "2018-03-21T14:48:58", "url": "https://files.pythonhosted.org/packages/a5/80/c0ccb505638c4e335d757418bdd27c74267766b3d964e1a47b1da3fda0b5/pistis-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "1b4c04d04572a2730df1c5cb5eb59403", "sha256": "3fd9ae27a1ad3839ddfcf1407df42bfa8644681777ea06195966755b3bbd559d" }, "downloads": -1, "filename": "pistis-0.2.1.tar.gz", "has_sig": false, "md5_digest": "1b4c04d04572a2730df1c5cb5eb59403", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25008486, "upload_time": "2018-04-19T12:45:00", "url": "https://files.pythonhosted.org/packages/6c/db/8a22307396cc27d48a4943ef8468a78fb1d492e0c593f17ae4f4faf06263/pistis-0.2.1.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "fbc6c8aab4e6cac4fe8618642f2ee38c", "sha256": "f4874284cf5d05c22773b45e8cea7e8bccc4f08aa2a7dc3c9beb71a282439586" }, "downloads": -1, "filename": "pistis-0.3.0.tar.gz", "has_sig": false, "md5_digest": "fbc6c8aab4e6cac4fe8618642f2ee38c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25009626, "upload_time": "2018-05-13T16:04:33", "url": "https://files.pythonhosted.org/packages/69/b2/84a869a81d793714b2b567c8185eecc76718f7201a5e0c9bf4f7b223754f/pistis-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "0221ef2f815278d224d2a6d9c2818927", "sha256": "a9e3bcd32d1042a5403bd748849c0c2202a22c31faa1223d7084e5312d52dbf1" }, "downloads": -1, "filename": "pistis-0.3.1.tar.gz", "has_sig": false, "md5_digest": "0221ef2f815278d224d2a6d9c2818927", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25006186, "upload_time": "2018-06-09T15:28:02", "url": "https://files.pythonhosted.org/packages/70/e6/89f0e608bb916e755ff5e1672a38c7a6bf7e63226ff4ae0a4f95a42ed4fd/pistis-0.3.1.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "1567b168d5cf6326b571b99f8704b9ba", "sha256": "8577ebd8c906310d36987f2642345343b54188fd3404e72b10e3ca7d7821153e" }, "downloads": -1, "filename": "pistis-0.3.2.tar.gz", "has_sig": false, "md5_digest": "1567b168d5cf6326b571b99f8704b9ba", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25006184, "upload_time": "2018-06-09T15:42:47", "url": "https://files.pythonhosted.org/packages/dc/c4/f56157eab3a7753c923206293fe51eae6601c7c1cb101b9d044c8346c52a/pistis-0.3.2.tar.gz" } ], "0.3.3": [ { "comment_text": "", "digests": { "md5": "4b9a8bf8b00058277fdd88aed0a6b7f9", "sha256": "9f83b1e46dda36a2c5f63237846fb5c5814ba2ef209dd754cf433a4beae01ab5" }, "downloads": -1, "filename": "pistis-0.3.3.tar.gz", "has_sig": false, "md5_digest": "4b9a8bf8b00058277fdd88aed0a6b7f9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25009660, "upload_time": "2018-06-09T15:48:58", "url": "https://files.pythonhosted.org/packages/0b/70/3fb18b36dc24384c1749db969225b27e60edb7aff1e3123a4d45a004f247/pistis-0.3.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4b9a8bf8b00058277fdd88aed0a6b7f9", "sha256": "9f83b1e46dda36a2c5f63237846fb5c5814ba2ef209dd754cf433a4beae01ab5" }, "downloads": -1, "filename": "pistis-0.3.3.tar.gz", "has_sig": false, "md5_digest": "4b9a8bf8b00058277fdd88aed0a6b7f9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25009660, "upload_time": "2018-06-09T15:48:58", "url": "https://files.pythonhosted.org/packages/0b/70/3fb18b36dc24384c1749db969225b27e60edb7aff1e3123a4d45a004f247/pistis-0.3.3.tar.gz" } ] }