{ "info": { "author": "Shaurita Hutchins & Robert Gilmore", "author_email": "datasnakes@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Natural Language :: English", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3 :: Only" ], "description": "[![Build Status](https://travis-ci.org/datasnakes/htseq-count-cluster.svg?branch=master)](https://travis-ci.org/datasnakes/htseq-count-cluster)\n\n# htseq-count-cluster\n\nA cli wrapper for running [htseq](https://github.com/simon-anders/htseq)'s `htseq-count` on a cluster.\n\nView [documentation](https://tinyurl.com/yb7kz7zz).\n\n## Install\n\n`pip install HTSeqCountCluster`\n\n## Features\n\n- For use with large datasets (we've previously used a dataset of 120 different human samples)\n- For use with SGE/SGI cluster systems\n- Submits multiple jobs\n- Command line interface/script\n- Merges counts files into one counts table/csv file\n- Uses `accepted_hits.bam` file output of `tophat`\n\n### Examples\n\n#### Run htseq-count-cluster\n\nAfter generating bam output files from tophat, instead of using HTSeq's htseq count, you\ncan use our `htseq-count-cluster` script. This script is intended for use with\nclusters that are using pbs (qsub) for job monitoring.\n\n```bash\nhtseq-count-cluster -p path/to/bam-files/ -f samples.csv -g genes.gtf -o path/to/cluster-output/\n```\n\n| Argument | Description | Required |\n|:--------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:--------:|\n| `-p` | This is the path of your .bam files. Presently, this script looks for a folder that is the sample name and searches for an accepted_hits.bam file (tophat output). | Yes |\n| `-i` | You should have a csv file list of your samples or folder names (no header). | Yes |\n| `-g` | This should be the path to your genes.gtf file. | Yes |\n| `-o` | This should be an existing directory for your output counts files. | Yes |\n| `-e` |\n\nThis script uses logzero so there will be color coded logging information to your shell.\n\nA common linux practice is to use `screen` to create a new shell and run a program\nso that if it does produce output to the stdout/shell, the user can exit that particular\nshell without the program ending and utilize another shell.\n\n##### Help message output for `htseq-count-cluster`\n\n```\nusage: htseq-count-cluster [-h] -p INPATH -f INFILE -g GTF -o OUTPATH\n [-e EMAIL]\n\nThis is a command line wrapper around htseq-count.\n\noptional arguments:\n -h, --help show this help message and exit\n -p INPATH, --inpath INPATH\n Path of your samples/sample folders.\n -f INFILE, --infile INFILE\n Name or path to your input csv file.\n -g GTF, --gtf GTF Name or path to your gtf/gff file.\n -o OUTPATH, --outpath OUTPATH\n Directory of your output counts file. The counts file\n will be named.\n -e EMAIL, --email EMAIL\n Email address to send script completion to.\n\n*Ensure that htseq-count is in your path.\n\n\n```\n\n\n#### Merge output counts files\n\nIn order to prep your data for `DESeq2`, `limma` or `edgeR`, it's best to have 1 merged\ncounts file instead of multiple files produced from the `htseq-count-cluster` script. We offer this\nas a standalone script as it may be useful to keep those files separate.\n\n```bash\nmerge-counts -d path/to/cluster-output/\n```\n\n##### Help message for `merge-counts`\n\n```\nusage: merge-counts [-h] -d DIRECTORY\n\nMerge multiple counts tables into 1 counts .csv file.\n\nYour output file will be named: merged_counts_table.csv\n\noptional arguments:\n -h, --help show this help message and exit\n -d DIRECTORY, --directory DIRECTORY\n Path to folder of counts files.\n```\n\n## ToDo\n\n- [ ] Monitor jobs.\n- [ ] Enhance wrapper input for other use cases.\n- [ ] Add example data.\n\n\n## Maintainers\n\nShaurita Hutchins | [@sdhutchins](https://github.com/sdhutchins) | [\u2709](mailto:sdhutchins@outlook.com)\nRob Gilmore | [@grabear](https://github.com/grabear) | [\u2709](mailto:robgilmore127@gmail.com)\n\n\n## Help\n\nPlease feel free to [open an issue](https://github.com/datasnakes/htseq-count-cluster/issues/new) if you have a question/feedback/problem\nor [submit a pull request](https://github.com/datasnakes/htseq-count-cluster/compare) to add a feature/refactor/etc. to this project.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/datasnakes/htseq-count-cluster", "keywords": "science lab pyschiatry rnaseq htseq", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "HTSeqCountCluster", "package_url": "https://pypi.org/project/HTSeqCountCluster/", "platform": "", "project_url": "https://pypi.org/project/HTSeqCountCluster/", "project_urls": { "Documentation": "https://tinyurl.com/yb7kz7zz", "Homepage": "https://github.com/datasnakes/htseq-count-cluster" }, "release_url": "https://pypi.org/project/HTSeqCountCluster/1.3/", "requires_dist": [ "HTSeq (>=0.9.1)", "pandas (>=0.20.3)", "logzero (>=1.3.1)" ], "requires_python": "", "summary": "A cli for running multiple pbs/qsub jobs with HTSeq's htseq-count script on a cluster.", "version": "1.3" }, "last_serial": 3870472, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "e2bef23ff97220aec69c45aaa365d44b", "sha256": "1ae3eefad54e3af50cb6ea798968be5b507f4069285599f2d05436004444b68a" }, "downloads": -1, "filename": "HTSeqCountCluster-1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e2bef23ff97220aec69c45aaa365d44b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 15999, "upload_time": "2018-05-15T21:44:35", "url": "https://files.pythonhosted.org/packages/e7/d8/e7d78c1d154b0ab8d1f1fab38b9ab637c21e1ae80f569eb96e77509db646/HTSeqCountCluster-1.0-py2.py3-none-any.whl" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "33bc13dde5e592e98fb91eea33982e0d", "sha256": "7af70fabb21dcc3e35c5dba11d57e3a33df2e12b71d03fe9bb0d8ef28e3d5cfc" }, "downloads": -1, "filename": "HTSeqCountCluster-1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "33bc13dde5e592e98fb91eea33982e0d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 12353, "upload_time": "2018-05-15T22:05:56", "url": "https://files.pythonhosted.org/packages/a6/08/3722b2cbdfb610ca84b4cf7fcec3620b404be4b28b2dc0dc87838ce2ac4a/HTSeqCountCluster-1.1-py2.py3-none-any.whl" } ], "1.2": [ { "comment_text": "", "digests": { "md5": "21439204eec568b46b31393ba750a749", "sha256": "869394a68ce1903cb14f7dfcffa1d72e30de4b2a975d01729af00ca2ccbdfbe2" }, "downloads": -1, "filename": "HTSeqCountCluster-1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "21439204eec568b46b31393ba750a749", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 15943, "upload_time": "2018-05-15T22:28:21", "url": "https://files.pythonhosted.org/packages/43/8b/6069c21daca6e460113d3d9c927111f6487e78a7820919a029c005940a05/HTSeqCountCluster-1.2-py2.py3-none-any.whl" } ], "1.3": [ { "comment_text": "", "digests": { "md5": "b62acd8ced77f4583fe6f235ff3ad4ac", "sha256": "b15ec712b2198589f95a1a8f317ebbbc6c3faf4bff2693e735939aa471cb399f" }, "downloads": -1, "filename": "HTSeqCountCluster-1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "b62acd8ced77f4583fe6f235ff3ad4ac", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 16042, "upload_time": "2018-05-16T23:46:20", "url": "https://files.pythonhosted.org/packages/75/29/57823375630962e48560163ca9a466fe6c1d2655487fef72b4ac0d2d02df/HTSeqCountCluster-1.3-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b62acd8ced77f4583fe6f235ff3ad4ac", "sha256": "b15ec712b2198589f95a1a8f317ebbbc6c3faf4bff2693e735939aa471cb399f" }, "downloads": -1, "filename": "HTSeqCountCluster-1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "b62acd8ced77f4583fe6f235ff3ad4ac", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 16042, "upload_time": "2018-05-16T23:46:20", "url": "https://files.pythonhosted.org/packages/75/29/57823375630962e48560163ca9a466fe6c1d2655487fef72b4ac0d2d02df/HTSeqCountCluster-1.3-py3-none-any.whl" } ] }