{ "info": { "author": "Johannes K\u00f6ster", "author_email": "johannes.koester@tu-dortmund.de", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "==============================================\r\nALPACA - The ALgebraic PArallel Variant CAller\r\n==============================================\r\n\r\nALPACA is a single nucleotide variant caller for next-generation sequencing\r\ndata, providing intuitive control over the false discovery rate with generic\r\nsample filtering scenarios, leveraging OpenCL on CPU, GPU or any coprocessor to\r\nspeed up calculations and an using HDF5 based persistent storage for iterative\r\nrefinement of analyses within seconds.\r\n\r\nOften, variant calling entails filtering different samples\r\nagainst each other, e.g. disease samples vs. healthy samples, tumor vs. normal or\r\nchildren vs. parents.\r\nThe filtering can be seen as operations over the set algebra of variant loci.\r\nIn general, the filtering is applied after calling.\r\nThis results in the null hypothesis considered by the variant caller to not\r\nproperly reflect the actual research question, which in fact entails the filtering.\r\nIn consequence, controlling the false discovery rate becomes difficult.\r\n\r\nUnlike other state of the art variant callers,\r\n**ALPACA integrates the filtering into the calling**\r\nby introducing a novel algebraic variant calling model.\r\nWhen calling, a filtering scenario can be specified with an algebraic expression\r\nlike A - (B + C) with A, B and C being samples. Algebraic calling allows ALPACA\r\nto report **posterior probabilities** for a variant to occur in the\r\n**unknown set of true variant loci**\r\nin A that are not in B or C here. Since the probabilities reflect the filtering,\r\nthey can be directly used to **intuitively control the false discovery rate**.\r\n\r\nALPACA splits variant calling into a preprocessing\r\nstep and the actual calling. Preprocessed samples are stored in HDF5 index data\r\nstructures. In a lightweight and massively parallel step, the sample indexes are merged\r\ninto an optimized index. On the optimized index, **variant calling becomes a matter of seconds**.\r\nUpon the addition of a sample, merging and the calling have to be repeated.\r\nThe sample indexes of the other samples remain untouched, **avoiding redundant computations**.\r\n\r\nAlgorithmic and mathematical details will be described in my thesis:\r\n\r\n Parallelization, Scalability and Reproducibility in Next-Generation Sequencing Analysis,\r\n Johannes K\u00f6ster, 2015 (work in progress)\r\n\r\n\r\nPrerequisites\r\n-------------\r\n\r\nALPACA needs\r\n\r\n* Linux\r\n* Python >= 3.3\r\n* Numpy >= 1.7\r\n* PyOpenCL >= 2013.1\r\n* h5py >= 1.8.4\r\n* samtools >= 1.0\r\n* mawk\r\n* a working OpenCL device (CPU, GPU, a coprocessor like Intel Xeon Phi or an FPGA)\r\n\r\nPython 3 should be installed on most systems.\r\nYou can make Debian and Ubuntu ready for installing ALPACA by issueing::\r\n\r\n $ sudo apt-get install python3-setuptools python3-numpy python3-h5py samtools mawk\r\n\r\nWithout admin rights, we recommend to use a userspace Python 3 distribution like\r\nhttps://store.continuum.io/cshop/anaconda.\r\n\r\nIf you want to use ALPACA on the GPU, a decent NVIDIA or AMD GPU with the proprietary\r\ndrivers installed should be enough. On Ubuntu and Debian, you can install them\r\nvia::\r\n\r\n $ sudo apt-get install nvidia-current\r\n\r\nor::\r\n\r\n $ sudo apt-get install fglrx\r\n\r\nTo use ALPACA with the CPU, you need an OpenCL runtime installed.\r\nYou can e.g. install the AMD APP SDK (which will work on any x86 CPU) from here:\r\nhttp://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk\r\n\r\n\r\nInstallation\r\n------------\r\n\r\nOnce the prerequisites are in place, ALPACA can be installed and updated with::\r\n\r\n $ easy_install3 --user -U alpaca\r\n\r\n\r\nUsage\r\n-----\r\n\r\nUsage of ALPACA consists of three major steps.\r\n\r\n* sample indexing\r\n* index merging\r\n* calling\r\n\r\nGiven mapped reads for a sample *A* in BAM format and a reference genome in FASTA format,\r\na sample index can be created with::\r\n\r\n $ alpaca index reference.fasta A.bam A.hdf5\r\n\r\nHere, various parameters like the expected ploidy of the sample can be adjusted.\r\nThe resulting index *A.hdf5* will be much smaller than the BAM file.\r\nMerging indexes for samples *A* and *B* is achieved with::\r\n\r\n $ alpaca merge A.hdf5 B.hdf5 all.hdf5\r\n\r\nFinally, calling can be performed on the merged index.\r\nALPACA allows to specify query expressions at the command line by representing\r\nthe union operator with a plus sign and the difference operator with a minus sign.\r\nThe variant calls are streamed out in VCF::\r\n\r\n $ alpaca call --fdr 0.05 all.hdf5 A-B > calls.vcf\r\n\r\nHere, we limit the desired rate of false discoveries to 5%.\r\nTo assess the biological importance of a variant, it is useful to annotate it with additional information like the gene it may be contained in, its effect on a protein that is encoded by the gene or whether it is already known and maybe associated to some disease.\r\nALPACA can annotate a VCF file with such information, using the Ensembl Variant Effect Predictor web service.\r\nSince the VCF format is rather technical, ALPACA can compose a human readable HTML file summarizing the calls.\r\nWe can combine the two commands using Unix pipes::\r\n\r\n $ alpaca annotate < calls.vcf | alpaca show > calls.html\r\n\r\nFor further information on various parameters of all steps (e.g. how to select\r\nthe compute device) can be obtained with::\r\n\r\n $ alpaca --help\r\n\r\n\r\nNews\r\n----\r\n\r\n=========== ========================================================================\r\n6 Feb 2014 Release 0.3.3 of ALPACA. Fixed mixed up annotations with annotate\r\n subcommand. Added column filters to HTML output.\r\n----------- ------------------------------------------------------------------------\r\n13 Jan 2014 Release 0.3.2 of ALPACA. Fixed imprecision in strand bias results.\r\n Further, this release introduces the k-relaxed intersection operator.\r\n A locus is contained in the k-relaxed intersection of a given set of\r\n samples if and only if it is variant in at least k samples.\r\n----------- ------------------------------------------------------------------------\r\n2 Dez 2014 Release 0.2.4 of ALPACA. Further improved HTML output of alpaca show.\r\n----------- ------------------------------------------------------------------------\r\n1 Dez 2014 Release 0.2.3 of ALPACA. Improved HTML output of alpaca show.\r\n----------- ------------------------------------------------------------------------\r\n30 Nov 2014 Release 0.2.2 of ALPACA. This initial release provides all functionality\r\n descibed in my thesis \"Parallelization, Scalability and Reproducibility\r\n in Next-Generation Sequencing Analysis\".\r\n=========== ========================================================================\r\n", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://alpaca.readthedocs.org", "keywords": null, "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "alpaca-variant-caller", "package_url": "https://pypi.org/project/alpaca-variant-caller/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/alpaca-variant-caller/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://alpaca.readthedocs.org" }, "release_url": "https://pypi.org/project/alpaca-variant-caller/0.3.3/", "requires_dist": null, "requires_python": null, "summary": "An algebraic parallel SNV caller using OpenCL", "version": "0.3.3" }, "last_serial": 1412464, "releases": { "0.2": [ { "comment_text": "", "digests": { "md5": "4553a5612dc7729836b44717f6e9d931", "sha256": "34bcb106a49866ff25a602587d7ed023856843364f1ef03c35c43d702d82261b" }, "downloads": -1, "filename": "alpaca-variant-caller-0.2.tar.gz", "has_sig": false, "md5_digest": "4553a5612dc7729836b44717f6e9d931", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42906, "upload_time": "2014-11-30T14:52:19", "url": "https://files.pythonhosted.org/packages/e7/81/f08a0a7c3a16561f2127d352b756f33bcca1dc42a744bd36e2f1ed81f999/alpaca-variant-caller-0.2.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "eab1a72bff8e356c7444e01170f6ddbf", "sha256": "0225f0265eb1e7f1c97473bf052e19e97d6cbbc43131f2d34c976644fa4727e5" }, "downloads": -1, "filename": "alpaca-variant-caller-0.2.1.tar.gz", "has_sig": false, "md5_digest": "eab1a72bff8e356c7444e01170f6ddbf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42902, "upload_time": "2014-11-30T14:54:12", "url": "https://files.pythonhosted.org/packages/e9/e7/5309fcfaaa23fd50df42681a5cec40f956d11cdc13581a46e064d294b17a/alpaca-variant-caller-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "d8ec6d9df168c5d4782ad5821e92229e", "sha256": "308b6f90b2b86f28931ddfa30bd8907fe509f354b08ea81de202c3ff1a7fa61e" }, "downloads": -1, "filename": "alpaca-variant-caller-0.2.2.tar.gz", "has_sig": false, "md5_digest": "d8ec6d9df168c5d4782ad5821e92229e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42898, "upload_time": "2014-11-30T14:57:55", "url": "https://files.pythonhosted.org/packages/7a/bf/09c382644c55f8c086123917a49817d7c88e3faa036f3d3b06840a68a114/alpaca-variant-caller-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "5be15e28a6f43f6d5c62c143ff4d15c1", "sha256": "f6f3d4549a0da7b4da91992626cd32a9977c8960df6c38b272a324161fe1678b" }, "downloads": -1, "filename": "alpaca-variant-caller-0.2.3.tar.gz", "has_sig": false, "md5_digest": "5be15e28a6f43f6d5c62c143ff4d15c1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 43272, "upload_time": "2014-12-01T08:06:19", "url": "https://files.pythonhosted.org/packages/a0/e7/e2d82f712cfad8ec439af51abf031ca6acb05ab6d6b6a86b83cc3917c28c/alpaca-variant-caller-0.2.3.tar.gz" } ], "0.2.4": [ { "comment_text": "", "digests": { "md5": "93d0fa0b2205a4bb7814fb5b67200572", "sha256": "94333736f6f9a8332e3c65da3e69119a8ac566d461b0ac96437e774c549014b2" }, "downloads": -1, "filename": "alpaca-variant-caller-0.2.4.tar.gz", "has_sig": false, "md5_digest": "93d0fa0b2205a4bb7814fb5b67200572", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42263, "upload_time": "2014-12-02T09:02:48", "url": "https://files.pythonhosted.org/packages/6b/80/bb62c7ab95d7b7b3fae07ea185619abd3295ee1f9665868e54b85e7c3755/alpaca-variant-caller-0.2.4.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "61c3222c785b8e9a5a2d13f85c083f7e", "sha256": "4f5b05d9349f36d815ef21319f254516c838a2f1c515a17cb91a186143002fe0" }, "downloads": -1, "filename": "alpaca-variant-caller-0.3.tar.gz", "has_sig": false, "md5_digest": "61c3222c785b8e9a5a2d13f85c083f7e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41898, "upload_time": "2015-01-13T15:06:41", "url": "https://files.pythonhosted.org/packages/db/5d/f53ea3988a851ee96dfc23cc12f8004b8a09dec16ea8084c9bf33c8cba9b/alpaca-variant-caller-0.3.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "7595cb4dd7f1b45fc416058c727d29ba", "sha256": "95b5a32b7cb930d6c44de3589757c88cb5fc1e5ee5d4cf61b5299d25f21a491b" }, "downloads": -1, "filename": "alpaca-variant-caller-0.3.1.tar.gz", "has_sig": false, "md5_digest": "7595cb4dd7f1b45fc416058c727d29ba", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42366, "upload_time": "2015-01-13T15:31:12", "url": "https://files.pythonhosted.org/packages/9b/88/52ae18376cefe617967525c517d7380c5a9d76770cb54bd0ec67f75b8deb/alpaca-variant-caller-0.3.1.tar.gz" } ], "0.3.3": [ { "comment_text": "", "digests": { "md5": "69431e82b14f2908885564cd46a43bb6", "sha256": "1f8abac8a1581cf718dec8f76b7566895207e17bde4a8817c900247d3cd6445c" }, "downloads": -1, "filename": "alpaca-variant-caller-0.3.3.tar.gz", "has_sig": false, "md5_digest": "69431e82b14f2908885564cd46a43bb6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 43286, "upload_time": "2015-02-06T11:03:02", "url": "https://files.pythonhosted.org/packages/4d/26/9ca7b22fb4ea52dda9a38609c123431114b938f30297d583b0b5640a22ed/alpaca-variant-caller-0.3.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "69431e82b14f2908885564cd46a43bb6", "sha256": "1f8abac8a1581cf718dec8f76b7566895207e17bde4a8817c900247d3cd6445c" }, "downloads": -1, "filename": "alpaca-variant-caller-0.3.3.tar.gz", "has_sig": false, "md5_digest": "69431e82b14f2908885564cd46a43bb6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 43286, "upload_time": "2015-02-06T11:03:02", "url": "https://files.pythonhosted.org/packages/4d/26/9ca7b22fb4ea52dda9a38609c123431114b938f30297d583b0b5640a22ed/alpaca-variant-caller-0.3.3.tar.gz" } ] }