{ "info": { "author": "Karel Brinda", "author_email": "kbrinda@hsph.harvard.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: Unix", "Programming Language :: Python :: 3 :: Only", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "SAMsift\n=======\n\n.. image:: https://travis-ci.org/karel-brinda/samsift.svg?branch=master\n :target: https://travis-ci.org/karel-brinda/samsift\n\n.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square\n :target: https://anaconda.org/bioconda/samsift\n\n.. image:: https://badge.fury.io/py/samsift.svg\n :target: https://badge.fury.io/py/samsift\n\n.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.1048211.svg\n :target: https://doi.org/10.5281/zenodo.1048211\n\nSAMsift is a program for advanced filtering and tagging of SAM/BAM alignments\nusing Python expressions.\n\n\nGetting started\n---------------\n\n.. code-block:: bash\n\n # clone this repo and add it to PATH\n git clone http://github.com/karel-brinda/samsift\n cd samsift\n export PATH=$(pwd)/samsift:$PATH\n\n # filtering: keep only alignments with score >94, save them as filtered.bam\n samsift -i tests/test.bam -o filtered.bam -f 'AS>94'\n # filtering: keep only unaligned reads\n samsift -i tests/test.bam -f 'FLAG & 0x04'\n # filtering: keep only aligned reads\n samsift -i tests/test.bam -f 'not(FLAG & 0x04)'\n # filtering: keep only sequences containing ACCAGAGGAT\n samsift -i tests/test.bam -f 'SEQ.find(\"ACCAGAGGAT\")!=-1'\n # filtering: keep only sequences containing A and T only (defined using regular expressions)\n samsift -i tests/test.bam -f 're.match(r\"^[AT]*$\", SEQ)'\n # filtering: sample alignments with 25% rate\n samsift -i tests/test.bam -f 'random.random()<0.25'\n # filtering: sample alignments with 25% rate with a fixed RNG seed\n samsift -i tests/test.bam -f 'random.random()<0.25' -0 'random.seed(42)'\n # filtering: keep only alignments of reads specified in tests/qnames.txt\n samsift -i tests/test.bam -0 'q=open(\"tests/qnames.txt\").read().splitlines()' -f 'QNAME in q'\n # filtering: keep only first 5000 reads from chr1 and 5000 reads from chr2\n samsift -i tests/test.bam -0 'c={\"chr1\":5000,\"chr2\":5000}' -f 'c[RNAME]>0' -c 'c[RNAME]-=1' -m nonstop-remove\n # tagging: add tags 'ln' with sequence length and 'ab' with average base quality\n samsift -i tests/test.bam -c 'ln=len(SEQ);ab=1.0*sum(QUALa)/ln'\n # tagging: add a tag 'ii' with the number of the current alignment\n samsift -i tests/test.bam -0 'i=0' -c 'i+=1;ii=i'\n # updating: removing sequences and base qualities\n samsift -i tests/test.bam -c 'a.query_sequence=\"\"'\n # updating: switching all reads to unaligned\n samsift -i tests/test.bam -c 'a.flag|=0x4;a.reference_start=-1;a.cigarstring=\"\";a.reference_id=-1;a.mapping_quality=0'\n\n\nInstallation\n------------\n\n**Using Bioconda:**\n\n.. code-block:: bash\n\n # add all necessary Bioconda channels\n conda config --add channels defaults\n conda config --add channels conda-forge\n conda config --add channels bioconda\n\n # install samsift\n conda install samsift\n\n\n**Using PIP from PyPI:**\n\n.. code-block:: bash\n\n pip install --upgrade samsift\n\n\n**Using PIP from Github:**\n\n.. code-block:: bash\n\n pip install --upgrade git+https://github.com/karel-brinda/samsift\n\n\nCommand-line parameters\n-----------------------\n\n.. USAGE-BEGIN\n\n.. code-block::\n\n\tProgram: samsift (advanced filtering and tagging of SAM/BAM alignments using Python expressions)\n\tVersion: 0.2.5\n\tAuthor: Karel Brinda \n\n\tUsage: samsift.py [-i FILE] [-o FILE] [-f PY_EXPR] [-c PY_CODE] [-m STR]\n\t [-0 PY_CODE] [-d PY_EXPR] [-t PY_EXPR]\n\n\tBasic options:\n\t -h, --help show this help message and exit\n\t -v, --version show program's version number and exit\n\t -i FILE input SAM/BAM file [-]\n\t -o FILE output SAM/BAM file [-]\n\t -f PY_EXPR filtering expression [True]\n\t -c PY_CODE code to be executed (e.g., assigning new tags) [None]\n\t -m STR mode: strict (stop on first error)\n\t nonstop-keep (keep alignments causing errors)\n\t nonstop-remove (remove alignments causing errors) [strict]\n\n\tAdvanced options:\n\t -0 PY_CODE initialization [None]\n\t -d PY_EXPR debugging expression to print [None]\n\t -t PY_EXPR debugging trigger [True]\n\n\n.. USAGE-END\n\nAlgorithm\n---------\n\n.. code-block:: python\n\n exec(INITIALIZATION)\n for ALIGNMENT in ALIGNMENTS:\n if eval(DEBUG_TRIGER):\n print(eval(DEBUG_EXPR))\n if eval(FILTER):\n exec(CODE)\n print(ALIGNMENT)\n\n\n**Python expressions and code.** All expressions and code should be valid with\nrespect to `Python 3 `_. Expressions are evaluated\nusing the `eval `_\nfunction and code is executed using the `exec\n`_ function.\nInitialization can be used for importing Python modules, setting global\nvariables (e.g., counters) or loading data from disk. Some modules (namely\n``random`` and ``re``) are loaded without an explicit request.\n\n*Example* (printing all alignments):\n\n.. code-block:: bash\n\n samsift -i tests/test.bam -f 'True'\n\n**SAM fields.** Expressions and code can access variables mirroring the fields\nfrom the alignment section of the `SAM specification\n`_, i.e., ``QNAME``, ``FLAG``,\n``RNAME``, ``POS`` (1-based), ``MAPQ``, ``CIGAR``, ``RNEXT``, ``PNEXT``,\n``TLEN``, ``SEQ``, and ``QUAL``. Several additional variables are defined to\nsimply accessing some useful information: ``QUALa`` stores the base qualities\nas an integer array; ``SEQs``, ``QUALs``, ``QUALsa`` skip soft-clipped bases;\nand ``RNAMEi`` and ``RNEXTi`` store the reference ids as integers.\n\n*Example* (keeping only the alignments with leftmost position <= 10000):\n\n.. code-block:: bash\n\n samsift -i tests/test.bam -f 'POS<=10000'\n\n\nSAMsift internally uses the `PySam `_ library and\nthe representation of the current alignment (an instance of the class\n`pysam.AlignedSegment\n`_) is\navailable as a variable ``a``. Therefore, the previous example is equivalent to\n\n.. code-block:: bash\n\n samsift -i tests/test.bam -f 'a.reference_start+1<=10000'\n\n\nThe ``a`` variable can also be used for modifying the current alignment record.\n\n*Example* (removing the sequence and the bases from every record):\n\n.. code-block:: bash\n\n samsift -i tests/test.bam -c 'a.query_sequence=\"\"'\n\n\n**SAM tags.** Every SAM tag is translated to a variable with the same name.\n\n*Example* (removing alignments with a score smaller or equal to the sequence length):\n\n.. code-block:: bash\n\n samsift -i tests/test.bam -f 'AS>len(SEQ)'\n\nIf ``CODE`` is provided, all two-letter variables except ``re`` (the Python regex\nmodule) are back-translated to tags after the code execution.\n\n*Example* (adding a tag ``ab`` carrying the average base quality):\n\n.. code-block:: bash\n\n samsift -i tests/test.bam -c 'ab=1.0*sum(QUALa)/len(QUALa)'\n\n**Errors.** If an error occurs during an evalution of an expression or an\nexecution of a code (e.g., due to accessing an undefined tag), then SAMsift\nbehavior depends on the specified mode (``-m``). With the strict mode (``-m\nstrict``, default), SAMsift will immediately interrupt the computation and\nreport an error. With the ``-m nonstop-keep`` option, SAMsift will continue\nprocessing the alignments while keeping the error-causing alignments in the\noutput. With the ``-m nonstop-remove`` option, all error-causing alignments\nare skipped and ommited from the output.\n\n\nSimilar programs\n----------------\n\n* `samtools view `_ can filter alignments based on FLAGS, read group tags, and CIGAR strings.\n* `sambamba view `_ supports, in addition to SAMtools, a filtration using `simple Perl-like expressions `_. However, it is not possible to use floats or compare different tags.\n* `BamQL `_ provides a simple query language for filtering SAM/BAM files.\n* `bamPals `_ adds tags XB, XE, XP and XL.\n* `SamJavascript `_ can filter alignments using JavaScript expressions.\n* `Picard FilterSamReads `_ can also filter alignments using JavaScript expressions.\n\n\nIssues\n------\n\nPlease use `Github issues `_.\n\n\nChangelog\n---------\n\nSee `Releases `_.\n\n\nLicence\n-------\n\n`MIT `_\n\n\nAuthor\n------\n\n`Karel Brinda `_ \n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/karel-brinda/samsift", "keywords": "NGS SAM alignment", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "samsift", "package_url": "https://pypi.org/project/samsift/", "platform": "", "project_url": "https://pypi.org/project/samsift/", "project_urls": { "Homepage": "https://github.com/karel-brinda/samsift" }, "release_url": "https://pypi.org/project/samsift/0.2.5/", "requires_dist": null, "requires_python": "", "summary": "SAMsift - sift your alignments", "version": "0.2.5" }, "last_serial": 3619041, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "d802fb31c3567ec3d0f8ea4dfce938a8", "sha256": "806670711dcac34d90eda341ef99134dcbd38f6a717aed6a73d94c1cea21bf65" }, "downloads": -1, "filename": "samsift-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "d802fb31c3567ec3d0f8ea4dfce938a8", "packagetype": "bdist_wheel", "python_version": "3.6", "requires_python": null, "size": 7783, "upload_time": "2017-08-31T19:33:50", "url": "https://files.pythonhosted.org/packages/49/74/bfd6db7a8eb094842ae7b91368a7943bf950da6cca27a69e78ba6ef3bbff/samsift-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d61699655e3724652415f890db9c0fbc", "sha256": "2508a030faee574e50fcfb0e2a76fc4ed5c90d2d434e68a63cdf04e1950ee2b9" }, "downloads": -1, "filename": "samsift-0.1.0.tar.gz", "has_sig": false, "md5_digest": "d61699655e3724652415f890db9c0fbc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4891, "upload_time": "2017-08-31T19:33:48", "url": "https://files.pythonhosted.org/packages/bf/88/63dd104a0ca7c3bf9f251bd45a7f9ab75e6db7fdfd0afc31a4f4377d3f17/samsift-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "bad7f853c0bc2ada7b7dbbb3db6389d3", "sha256": "c4917302c76ff95e185755a5fd1dbba7c9a5bb28e334de8e6d01e094564cba62" }, "downloads": -1, "filename": "samsift-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "bad7f853c0bc2ada7b7dbbb3db6389d3", "packagetype": "bdist_wheel", "python_version": "3.6", "requires_python": null, "size": 11903, "upload_time": "2017-09-05T21:47:36", "url": "https://files.pythonhosted.org/packages/72/39/1c2e0b71edfd2d67576e6c40e445fc445e384b5a6deba79deed2036b1395/samsift-0.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05004bd4c0fb03d5239c4643f13217fb", "sha256": "3cc39d4999e9cec9557d4c47919585a90d58b79cd71c762cf98e624cc7d0db44" }, "downloads": -1, "filename": "samsift-0.2.0.tar.gz", "has_sig": false, "md5_digest": "05004bd4c0fb03d5239c4643f13217fb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8630, "upload_time": "2017-09-05T21:47:34", "url": "https://files.pythonhosted.org/packages/ce/e2/8952ac72ffc66b571a87b7e3c8e2763be0ce642019c7b5c9dccd0297d0ae/samsift-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "2aed9f2c280429526589057711b96d32", "sha256": "5cd72b3c1f7f667180717e4faf3ef48ccc039e3f95216963d2c4780eb6f73302" }, "downloads": -1, "filename": "samsift-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "2aed9f2c280429526589057711b96d32", "packagetype": "bdist_wheel", "python_version": "3.6", "requires_python": null, "size": 13653, "upload_time": "2017-09-12T19:48:35", "url": "https://files.pythonhosted.org/packages/ca/27/29c8963867910587de98565801a7da0523ea9cb9559de8e3291c992a5fb1/samsift-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d69683f9ee097bfd3bb5da8f780af0c7", "sha256": "aa30f27d5dc8258e0bfd3eff1fcf6b290815865028ce8fce45e0c216f793a887" }, "downloads": -1, "filename": "samsift-0.2.1.tar.gz", "has_sig": false, "md5_digest": "d69683f9ee097bfd3bb5da8f780af0c7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9650, "upload_time": "2017-09-12T19:48:31", "url": "https://files.pythonhosted.org/packages/69/03/6a3e4883b9b772e7dbca29bfd73cadaa0f05ec9c2c60df90c774dbaff780/samsift-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "dd5bca55b63e58a74c33f02eb94a9b44", "sha256": "52bda69f13c33c2441833fde7b317ecc62caac8990da7eb20aa3071dd926d3b1" }, "downloads": -1, "filename": "samsift-0.2.2.tar.gz", "has_sig": false, "md5_digest": "dd5bca55b63e58a74c33f02eb94a9b44", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11746, "upload_time": "2017-10-11T03:01:37", "url": "https://files.pythonhosted.org/packages/62/66/d770998b0ecd699577daa3940d48e5711941bbbabdadfa9250aa1c601fd4/samsift-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "42bcf5618a3093c1549bf1b9b3d93af8", "sha256": "1e7f83917a3dd69372c61e79846a78878ad88434d4fdf8e606263431ba890021" }, "downloads": -1, "filename": "samsift-0.2.3-py3-none-any.whl", "has_sig": false, "md5_digest": "42bcf5618a3093c1549bf1b9b3d93af8", "packagetype": "bdist_wheel", "python_version": "3.6", "requires_python": null, "size": 14338, "upload_time": "2017-11-13T15:47:03", "url": "https://files.pythonhosted.org/packages/62/62/386720e4b21c32320f3093fb70f8d8f6d27a6dc45a42fec6e83801360259/samsift-0.2.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "45e54dfb5f2a1d70a1189f91151a3d97", "sha256": "893617fda0d4be2c2339848f2f7e9d93c4f45e98ace5729a796eb9e29c1d3d31" }, "downloads": -1, "filename": "samsift-0.2.3.tar.gz", "has_sig": false, "md5_digest": "45e54dfb5f2a1d70a1189f91151a3d97", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11098, "upload_time": "2017-11-13T15:47:02", "url": "https://files.pythonhosted.org/packages/c1/f9/f58528a710015e46c76d70f2ddbf07f9c8bbe97e53da8c8164868f7982b1/samsift-0.2.3.tar.gz" } ], "0.2.5": [ { "comment_text": "", "digests": { "md5": "97b0d7edfbf6e395cde9e943690c4fcb", "sha256": "b526d5046557a357751ce659fa78984df8aeef617490fbbb423dafa24b54d102" }, "downloads": -1, "filename": "samsift-0.2.5-py3-none-any.whl", "has_sig": false, "md5_digest": "97b0d7edfbf6e395cde9e943690c4fcb", "packagetype": "bdist_wheel", "python_version": "3.6", "requires_python": null, "size": 15013, "upload_time": "2018-02-26T22:56:21", "url": "https://files.pythonhosted.org/packages/e9/13/8bd6c7be4a08785fc1857a661fd9fa877e53da31551c33e8c90d6250bddb/samsift-0.2.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b39e5f3657b086da41d9a0d79739f37c", "sha256": "1b02ec272201c0faf349ffbcc422fc1250f23f4f6a5563c1104cd37755044064" }, "downloads": -1, "filename": "samsift-0.2.5.tar.gz", "has_sig": false, "md5_digest": "b39e5f3657b086da41d9a0d79739f37c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11757, "upload_time": "2018-02-26T22:56:19", "url": "https://files.pythonhosted.org/packages/d5/ee/d2f5e4d4f6ccfdd1c00a866b9f05bfc839da5cf999819401f96a310e71c7/samsift-0.2.5.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "97b0d7edfbf6e395cde9e943690c4fcb", "sha256": "b526d5046557a357751ce659fa78984df8aeef617490fbbb423dafa24b54d102" }, "downloads": -1, "filename": "samsift-0.2.5-py3-none-any.whl", "has_sig": false, "md5_digest": "97b0d7edfbf6e395cde9e943690c4fcb", "packagetype": "bdist_wheel", "python_version": "3.6", "requires_python": null, "size": 15013, "upload_time": "2018-02-26T22:56:21", "url": "https://files.pythonhosted.org/packages/e9/13/8bd6c7be4a08785fc1857a661fd9fa877e53da31551c33e8c90d6250bddb/samsift-0.2.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b39e5f3657b086da41d9a0d79739f37c", "sha256": "1b02ec272201c0faf349ffbcc422fc1250f23f4f6a5563c1104cd37755044064" }, "downloads": -1, "filename": "samsift-0.2.5.tar.gz", "has_sig": false, "md5_digest": "b39e5f3657b086da41d9a0d79739f37c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11757, "upload_time": "2018-02-26T22:56:19", "url": "https://files.pythonhosted.org/packages/d5/ee/d2f5e4d4f6ccfdd1c00a866b9f05bfc839da5cf999819401f96a310e71c7/samsift-0.2.5.tar.gz" } ] }