{ "info": { "author": "Matthew Shirley", "author_email": "mdshw5@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Operating System :: Unix", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: Implementation :: PyPy", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "[![Build Status](https://travis-ci.org/mdshw5/hamstring.svg?branch=master)](https://travis-ci.org/mdshw5/hamstring)\n[![PyPI](https://pypi.python.org/pypi/hamstring)](https://img.shields.io/pypi/v/hamstring.svg?branch=master)\n\nThis python module generates, checks, and corrects quaternary Hamming barcodes. The theory for generating quaternary (DNA) Hamming barcodes comes from the publication [Bystrykh, L. V. (2012). Generalized DNA Barcode Design Based on Hamming Codes. PLoS ONE](http://www.plosone.org/article/info:doi/10.1371/journal.pone.0036852). Currently, the `hamstring` module only works with Hamming7,4 encoding, but may be generalized to other sizes of data and parity bits.\n\n[![Figure1 Bystrykh et al.](redist/figure1.png)](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0036852#pone-0036852-g001)\n\n## Usage\n\n**Generate Hamming DNA barcodes**\n\n generateBarcodes.py [-h] out\n\n arguments:\n out output barcode file name\n -h, --help show this help message and exit\n -p PARITY, --parity PARITY\n length of the parity bit e.g. 
4 for Hamming8,4.\n default=3\n\nexample output:\n\n index base4 nucleotide gc\n 0 0000000 AAAAAAA 0.0\n 1 3303001 TTATAAC 0.14\n 2 2202002 GGAGAAG 0.57\n 3 1101003 CCACAAT 0.43\n 4 0303010 ATATACA 0.14\n\n**Checksum a list of Hamming DNA barcodes**\n\n checkBarcodes.py [-h] list\n\n arguments:\n list list of barcodes to check, one per line\n -h, --help show this help message and exit\n -p PARITY, --parity PARITY\n length of the parity bit e.g. 4 for Hamming8,4.\n default=3\n\nexample output:\n\n in\tfixed\tchecksum\n CATAACT\tCAAAACT\tA > T at pos 3\n AGAGAGA\tAGAGAGA\tok\n TCACAGC\tTCACAGC\tok\n GAACAGG\tGAAAAGG\tA > C at pos 4\n CTATAGT\tCTATAGT\tok\n TTTAAAN\tNNNNNNN\tbad\n NNNNNNN\tNNNNNNN\tbad\n\n**Tag fastq reads with a barcode (for generating a simulated dataset)**\n\n tagReads.py [-h] -e nb fastq out\n\n arguments:\n nb number of barcodes to generate\n fastq fastq file to process\n out name for new fastq file\n -e, --erate error rate for single barcode base errors. default=0.05\n -p PARITY, --parity PARITY\n length of the parity bit e.g. 
4 for Hamming8,4.\n default=3\n -h, --help show this help message and exit\n\nexample input:\n\n @HWI-EAS179_0001:5:1:7:119#0/1\n CAGGGCGCGAATGNTTTGAGAGGGANATTGGAAANNNNNGATAGANNGGNCTATNNTGNNNNNNNNNNNNNNNNNN\n +\n HIHHHGHHHFDHH#EHHH?HHHDH>#DGGG@7@?##########################################\n\nexample output:\n\n @HWI-EAS179_0001:5:1:7:119#0/1\n CCATGGCCAGGGCGCGAATGNTTTGAGAGGGANATTGGAAANNNNNGATAGANNGGNCTATNNTGNNNNNNNNNNNNNNNNNN\n +\n HHHHHHHHIHHHGHHHFDHH#EHHH?HHHDH>#DGGG@7@?##########################################\n\n**Check and fix barcodes in fastq file**\n\n fixFastq.py [-h] [-s] list fastq out\n\n arguments:\n list list of barcodes used in experiment, one per line\n fastq fastq file to process\n out name for new fastq file\n -s, --strict change all barcodes not in list to 'N'\n -h, --help show this help message and exit\n\nexample input:\n\n @HWI-EAS179_0001:5:1:7:119#0/1\n GCATGGCCAGGGCGCGAATGNTTTGAGAGGGANATTGGAAANNNNNGATAGANNGGNCTATNNTGNNNNNNNNNNNNNNNNNN\n +\n HHHHHHHHIHHHGHHHFDHH#EHHH?HHHDH>#DGGG@7@?##########################################\n\nexample output:\n\n @HWI-EAS179_0001:5:1:7:119#0/1\n CCATGGCCAGGGCGCGAATGNTTTGAGAGGGANATTGGAAANNNNNGATAGANNGGNCTATNNTGNNNNNNNNNNNNNNNNNN\n +\n HHHHHHHHIHHHGHHHFDHH#EHHH?HHHDH>#DGGG@7@?##########################################\n\n## Requirements\n\n- Python 2.7 or Python 3.2+\n\n## Hamstring Module\n\nThe core hamstring module has no external module dependencies and should run under any OS.\n\n`base4Encode(n,d)` is used to convert decimal notation *n* to quaternary notation with *d* leading digits. 
*example*:\n\n hamstring.base4Encode(22, 4)\n [0, 1, 1, 2]\n\n`generateHamming(data,parity)` generates a DNA quaternary Hamming code from a list of quaternary digits *data*, using *parity* parity bits.\n*example*:\n\n```python\n>>> generateHamming([0,1,1,2], 3)\nBarcode(base4='1100112', nucleotide='CCAACCG', gc=0.71)\n\n>>> generateHamming([0,1,1,2], 4)\nBarcode(base4='11001122', nucleotide='CCAACCGG', gc=0.75)\n```\n\n`decodeHamming(barcode,parity)` decodes the nucleotide Hamming string *barcode* with *parity* parity bits and performs error correction if needed.\n*example*:\n\n```python\n>>> decodeHamming('CCAACCG', 3)\nCheckedBarcode(nucleotide='CCAACCG', chksum='ok')\n\n>>> decodeHamming('CCAACCGG', 4)\nCheckedBarcode(nucleotide='CCAACCGG', chksum='ok')\n\n>>> decodeHamming('CCATCCG', 3)\nCheckedBarcode(nucleotide='CCAACCG', chksum='A > T at pos 4')\n\n>>> decodeHamming('CCATCCGG', 4)\nCheckedBarcode(nucleotide='CCAACCGG', chksum='A > T at pos 4')\n\n>>> decodeHamming('TCATCCGG', 4)\nCheckedBarcode(nucleotide='NNNNNNNN', chksum='bad')\n```\n\n## Author\n\nMatt Shirley - mdshw5'at'gmail'.'com - [http://mattshirley.com](http://mattshirley.com)", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://mattshirley.com", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "hamstring", "package_url": "https://pypi.org/project/hamstring/", "platform": "", "project_url": "https://pypi.org/project/hamstring/", "project_urls": { "Homepage": "http://mattshirley.com" }, "release_url": "https://pypi.org/project/hamstring/1.1/", "requires_dist": null, "requires_python": "", "summary": "This python module generates, checks, and corrects quaternary Hamming barcodes.", "version": "1.1" }, "last_serial": 3622786, "releases": { "1.1": [ { "comment_text": "", "digests": { "md5": "d5b594b59efb5216374e18981b363939", 
"sha256": "a66d3f34cd440bbca01b63d540f8cdc606edb733dd7d80f55091377d2254a24c" }, "downloads": -1, "filename": "hamstring-1.1.tar.gz", "has_sig": false, "md5_digest": "d5b594b59efb5216374e18981b363939", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5111, "upload_time": "2018-02-27T19:27:32", "url": "https://files.pythonhosted.org/packages/3a/08/d12b759c95e5dd6f61061410074148d8072bc7539e8b6885b3376022ee73/hamstring-1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d5b594b59efb5216374e18981b363939", "sha256": "a66d3f34cd440bbca01b63d540f8cdc606edb733dd7d80f55091377d2254a24c" }, "downloads": -1, "filename": "hamstring-1.1.tar.gz", "has_sig": false, "md5_digest": "d5b594b59efb5216374e18981b363939", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5111, "upload_time": "2018-02-27T19:27:32", "url": "https://files.pythonhosted.org/packages/3a/08/d12b759c95e5dd6f61061410074148d8072bc7539e8b6885b3376022ee73/hamstring-1.1.tar.gz" } ] }