{ "info": { "author": "Jeremiah H. Savage", "author_email": "jeremiahsavage@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Operating System :: POSIX", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "# gdc-readgroups\n\n[![PyPI version](https://badge.fury.io/py/gdc-readgroups.svg)](https://badge.fury.io/py/gdc-readgroups)\n\n## Purpose\nThis package will extract the Read Group header lines from a BAM file, and convert the contained metadata to a json or tsv file with appropriate values applied for creation of a Read Group node in the [NCI's Genomic Data Commons](https://gdc.cancer.gov/) (GDC). Optionally, it will take no input, and output a template which may be edited to create a submission to the GDC.\n\nThe generated file may contain some fields marked `REQUIRED`, which indicates these fields could not be generated from the supplied BAM file. In this case, the user must apply their own desired values to the generated json. The `` must be as indicated in the generated json file. For details, see the column `Acceptable Types or Values` at the [GDC Data Dictionary Viewer](https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=read_group).\n\n\n\nOther fields are optional, and are marked `OPTIONAL`. If these fields could not be generated from the supplied BAM file, they may be filled in as appropriate or removed.\n\n#### Note\n\nThe tool will only run on complete BAM files - files which contain the suffix `.bam`.\n\nIf the BAM is truncated, the error\n\n```\n OSError: no BGZF EOF marker; file may be truncated\n```\n\nwill be generated, and no json will be produced.\n\n\n## Installation\nThere are 2 ways to install `gdc-readgroups`\n\n#### pip install from pypi\n`gdc-readgroups` may be used as a `pip` installed python package.\n\nIf you would like to install the package as root, for all users, run\n\n```bash\nsudo pip install gdc-readgroups\n```\n\nIf you would like to install the package only for a local user, run\n\n```bash\npip install gdc-readgroups --user\n```\n\n#### Build a Docker Image\nThe github repository for this package contains a Dockerfile, which may be used to build an image containing the package and all prerequisites. There are two ways to build the image.\n\n1. Using `docker` directly.\n ```bash\n wget https://raw.githubusercontent.com/NCI-GDC/gdc-readgroups/master/Dockerfile\n docker build -t gdc-readgroups .\n ```\n\n1. Using `cwltool` to build an image, and then run it, in one command.\n\n In this case the cwl tool will expect a BAM input, and produce a json output. To install the reference CWL engine, run\n ```bash\n pip install cwltool --user\n ```\n Then to build the `gdc-readgroups` Docker Image and run the Container, run\n\n ```bash\n wget https://raw.githubusercontent.com/NCI-GDC/gdc-readgroups/master/Dockerfile\n wget https://raw.githubusercontent.com/NCI-GDC/gdc-readgroups/master/gdc-readgroups.cwl\n cwltool gdc-readgroups.cwl --INPUT \n ```\n The above command will only build the Docker Image if it does not exist on the system. After the build is performed once, the image will remain on your system, and the next `cwltool` run will skip the build step.\n\n## Usage\n\n`gdc-readgroups` has two main modes: `bam-mode` and `template-mode`. \n\n#### bam-mode\n\nIn `bam-mode`, a path to a BAM file must be supplied as input. By default, `bam-mode` will output a json file, but optionally may output a tsv file.\n\nThe command to run the [pip installed package](#pip-install-from-pypi) is\n\n```bash\ngdc-readgroups bam-mode --bam_path \n```\n\nThe generated json will be placed in the current working directory and have a filename of `.json`.\nAny error messages will be written to stdout.\n\nTo output a tsv file, run\n\n```bash\ngdc-readgroups bam-mode --bam_path --output-format tsv\n```\n\nThe generated tsv file will be placed in your current working directory, and be of the form `.tsv`\n\n\n#### template-mode\n\nIn `template-mode`, no input is supplied, and two empty records are output within one file, either in json or tsv format.\n\nTo generate a json template, run\n\n```bash\ngdc-readgroups template-mode\n```\n\nThe output will be placed in the current working directory and have a filename of `gdc_readgroups.json`\n\nTo generate a tsv template, run\n\n```bash\ngdc-readgroups template-mode --output-format tsv\n```\n\nThe output will be placed in the current working directory and have a filename of `gdc_readgroups.tsv`\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/NCI-GDC/gdc-readgroups/", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "gdc-readgroups", "package_url": "https://pypi.org/project/gdc-readgroups/", "platform": "", "project_url": "https://pypi.org/project/gdc-readgroups/", "project_urls": { "Homepage": "https://github.com/NCI-GDC/gdc-readgroups/" }, "release_url": "https://pypi.org/project/gdc-readgroups/0.4/", "requires_dist": [ "pysam (==0.15.2)", "python-dateutil (==2.8.0)" ], "requires_python": "", "summary": "From a BAM, convert each readgroup to a json/tsv object needed to create a GDC Read Group node.", "version": "0.4" }, "last_serial": 4911541, "releases": { "0.4": [ { "comment_text": "", "digests": { "md5": "aebeaa3c80e25e6f71d6fbe4b2eb3250", "sha256": "85ff8ebe9769937d2d4dfa90256ab2412efaa73e8b226982aa934d5fdd9e37f5" }, "downloads": -1, "filename": "gdc_readgroups-0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "aebeaa3c80e25e6f71d6fbe4b2eb3250", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 23590, "upload_time": "2019-03-07T16:58:11", "url": "https://files.pythonhosted.org/packages/cf/35/6ffd3d64f18a9b462e0716051e32767703ca7aef290a470866aaf4267171/gdc_readgroups-0.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b3252c98b8daa03ad0fd13dbad520645", "sha256": "2d6af7d07f0211ca54c6743f9546cc852178470bcff577f51257cdf02bb3a3bf" }, "downloads": -1, "filename": "gdc_readgroups-0.4.tar.gz", "has_sig": false, "md5_digest": "b3252c98b8daa03ad0fd13dbad520645", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9733, "upload_time": "2019-03-07T16:58:12", "url": "https://files.pythonhosted.org/packages/ff/4b/e66fa47dbfe467ca787bde8e8c3c127506eba79174527d81e81869f969ce/gdc_readgroups-0.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "aebeaa3c80e25e6f71d6fbe4b2eb3250", "sha256": "85ff8ebe9769937d2d4dfa90256ab2412efaa73e8b226982aa934d5fdd9e37f5" }, "downloads": -1, "filename": "gdc_readgroups-0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "aebeaa3c80e25e6f71d6fbe4b2eb3250", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 23590, "upload_time": "2019-03-07T16:58:11", "url": "https://files.pythonhosted.org/packages/cf/35/6ffd3d64f18a9b462e0716051e32767703ca7aef290a470866aaf4267171/gdc_readgroups-0.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b3252c98b8daa03ad0fd13dbad520645", "sha256": "2d6af7d07f0211ca54c6743f9546cc852178470bcff577f51257cdf02bb3a3bf" }, "downloads": -1, "filename": "gdc_readgroups-0.4.tar.gz", "has_sig": false, "md5_digest": "b3252c98b8daa03ad0fd13dbad520645", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9733, "upload_time": "2019-03-07T16:58:12", "url": "https://files.pythonhosted.org/packages/ff/4b/e66fa47dbfe467ca787bde8e8c3c127506eba79174527d81e81869f969ce/gdc_readgroups-0.4.tar.gz" } ] }