{ "info": { "author": "Zhen Liu", "author_email": "liuzhen2018@sibs.ac.cn", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3" ], "description": "# fixalign-project\n\nAn annotation based method to fix small exons missed alignment defects in Nanopore long reads.\n\n\nThe example plot above show the misaligned exon in the middle is find by compaired the reference annotation BED and realigned to the correct position. When we use [minimap2](https://github.com/lh3/minimap2) to align ONT RNA-seq reads, it is easy to miss the exon with small size, because of the difficulty to find an exact match anchor in these region.\n\n### INSTALLATION\n\n`pip install fixalign`\n\nUpgrade to a newer version using:\n`pip install fixalign --upgrade`\n\nThe package is written for python3\n\n### INPUT\n\nfixalign requires four essential arguments.\n- BAM file\n- genome reference fasta (with index fai in the same directory)\n- transcript annotation BED12\n- the target file for the output exon-missed regions information \n\n### OUTPUT\n- regions which have missed small exons\n- realigned BAM file which are fixed \n\n### USAGE\n```\nfixalign [-h] [-s N] [-d float] [-f N] [--ignoreStrand] [--detail]\n [--floatFlankLen] [-o file | --onlyRegion]\n inBam genomeFasta annotBed outRegion\n\npositional arguments:\n inBam Input original bam file.\n genomeFasta Reference genome fasta file, with fai index under the same directory.\n annotBed Annotated transcripts file in BED12 format.\n outRegion Output Region file, regions contain missed small exons.\n\noptional arguments:\n -h, --help show this help message and exit.\n -s, --exonSizeThd N The threshold of exons size, ignore transcript exons with size > exonSizeThd (default: 100).\n We use exon size and delta ratio to judge whether the intron region of each read has missed small exons or not. \n For more explanation, please go to the NOTES.\n -d, --deltaRatioThd float\n The threshold of absolute delta ratio, ignore abs(delta ratio) > deltaRatioThd (default: 0.5).\n Delta ratio = modified margin length / exon size. For more explanation, please go to the NOTES.\n -f, --flankLen N The extended length on the both sides of realign region (default: 20).\n --ignoreStrand Consider both strands (default: False).\n --detail Return all possible missed exons on different transcripts (default: False).\n --floatFlankLen Flank length can be changed by adjacent indel (default: False).\n -o, --outBam file Output realigned bam file (default: None).\n --onlyRegion Only return the Region file without realign process (default: False).\n\n```\n\n### NOTES\n\nThe basic idea for fixalign to check whether the intron regions of reads miss small exons is based on the length compairsion. \n\n- **margin length** As the illustrator show above, *margin length* is defined as the extra length of read exons compaired to the annotation exons (e.g. a, b and m1+m2 is *margin length* correspond to the annotated two transcripts). We find the reads introns if the intron overlaps with the annotated exon, and then check whether the *margin length* close to the *exon length*. It is easy to think that *margin length* should be close to the *exon length* if they come from the annotated exon. But as we find out that these false aligned region (realign region) usually contain many of indels (insertion and deletion) and the *margin length* is smaller than *exon length* in the most case. \n- **flank length** In order to define the range of the realign region. we set the region start = min(annoted splice site, read splice site) - *flank length* and the region end = max(annoted splice site, read splice site) + *flank length*\n- **modified margin length** we count the number of indels in the realign region and define *modified margin length* (*modified margin length* = *margin length* + insertions - deletions) instead of *margin length* to compair with the *exon length*.\n- **delta ratio** *delta ratio* = (*modified margin length* - *exon length*) / *exon length*. We choose *exon length* <= `exonSizeThd` and *delta ratio* <= `deltaRatioThd` to filter out the false positives and record the left regions as exon-missed region. The regions information is output in `outRegion`.\n- `--detail` One Read intron may be overlapped with exons on multiple transcripts. As default, we only return the annotated transcript with minimum *delta raio* as the correct reference. You can set `--detail` to keep all possible transcripts reference. In the realign process, we only keep one realign result among different transcript, the one with highest realignment score will be kept.\n\nWe use [parasail](https://github.com/jeffdaily/parasail-python) to do the global pairwise alignment between read sequence and annotated exon sequence in the realign region. We only keep the realign result if realigned score > original score. \nATTENTION: the original score does not refer to the AS field in BAM if provided. We calculate the realigned score and origial score based on the realignment and original alignment in the realign region, and the score is equal to the matched bases count - edit distance. As different alignment tools may have different score system, we do not change the AS of NM field in BAM if provided.\n- `--onlyRegion` Although we default to return the realigned result in `-o, --outBam file`, you can set the `--onlyRegion` to skip the realign process (although the realign process is not the bottleneck at present).\n- `--ignoreStrand` This argument is used if the original reads is not stranded RNA-seq. We will try both strands to find the overlapped exons.\n\n### EXAMPLE USAGE\n```bash\ngit clone https://github.com/zhenLiuExplr/fixalign-project\ncd examples\nfixalign ex.bam ex.fa ex_annotation.bed ex_realign_region -f 20 --ignoreStrand -o realn.bam\n```\n\n### ACKNOWLEDGMENTS/CONTRIBUTORS\n- Zhen Liu for building and maintance\n- Wu Wei and Chenchen Zhu for advising\n\n### CONTRIBUTING\nWelcome for all suggestions, bug reports, feature request and contributions. You can leave an [issue](https://github.com/zhenLiuExplr/fixalign-project/issues) or open a pull request.\n\n### CITATION\nIf you use this tool, please consider citing our [publication]()\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/zhenLiuExplr/fixalign-project", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "fixalign", "package_url": "https://pypi.org/project/fixalign/", "platform": "", "project_url": "https://pypi.org/project/fixalign/", "project_urls": { "Homepage": "https://github.com/zhenLiuExplr/fixalign-project" }, "release_url": "https://pypi.org/project/fixalign/0.2.3/", "requires_dist": [ "numpy", "pandas", "pysam", "parasail (==1.1.17)" ], "requires_python": ">=3", "summary": "find and fix missed small exons", "version": "0.2.3" }, "last_serial": 5942374, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "33aac8a4e60ee91f3cf8d1b73a5b04c7", "sha256": "64ad32a029f62213af1316518785222aa27157d7e52007d16b6e3113524517dd" }, "downloads": -1, "filename": "fixalign-0.0.1-py3.7.egg", "has_sig": false, "md5_digest": "33aac8a4e60ee91f3cf8d1b73a5b04c7", "packagetype": "bdist_egg", "python_version": "3.7", "requires_python": ">=3", "size": 28449, "upload_time": "2019-08-29T02:56:59", "url": "https://files.pythonhosted.org/packages/46/23/f26f3e9a740348983f7c2bbf8a794e7a104aa05e19d65359b9635ec52327/fixalign-0.0.1-py3.7.egg" }, { "comment_text": "", "digests": { "md5": "627d39a68bbba57b86ac1b8e76b9c945", "sha256": "a5a2655346dcb9b656093c9cdeead6be51e5bb399362f6da318c68bbf411f0f0" }, "downloads": -1, "filename": "fixalign-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "627d39a68bbba57b86ac1b8e76b9c945", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 15421, "upload_time": "2019-08-29T02:56:55", "url": "https://files.pythonhosted.org/packages/2a/cf/4d6fe40548a56b590030b0c8b0eb1c7edf374c64b2a447ee423f3262ff80/fixalign-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "03937fe150c8754bbc440a46bd159568", "sha256": "837470f935da9c9a51d92abcff82ce9f054011a5c308f53ff12bf2e8d79ec111" }, "downloads": -1, "filename": "fixalign-0.0.1.tar.gz", "has_sig": false, "md5_digest": "03937fe150c8754bbc440a46bd159568", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 12966, "upload_time": "2019-08-29T02:57:01", "url": "https://files.pythonhosted.org/packages/1e/a2/d953861094119c2e7b7c37b5258c23bf3e0e67658bdc39515b92eb250fe4/fixalign-0.0.1.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "01ad7724007d332c35f6bb3d8dfce668", "sha256": "db463f5bb203d20efe2355a3a8a3b84a841e2bd8edf69e79981c5debc774f93a" }, "downloads": -1, "filename": "fixalign-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "01ad7724007d332c35f6bb3d8dfce668", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13648, "upload_time": "2019-08-29T02:56:57", "url": "https://files.pythonhosted.org/packages/d8/65/e69f3bb1498838739b984c2a0d08d3d5bea8d818209c5d0ed70432a1f37b/fixalign-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8816e0898aa29f5bd7ffced6c91c0cfd", "sha256": "0abde246c35309817f2df5b68b30315598eccef96ec2bb5602f1112c888524f8" }, "downloads": -1, "filename": "fixalign-0.1.0.tar.gz", "has_sig": false, "md5_digest": "8816e0898aa29f5bd7ffced6c91c0cfd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11569, "upload_time": "2019-08-29T02:57:03", "url": "https://files.pythonhosted.org/packages/ab/be/8361e505b362d870d44d4110a29a5f493e0cf00b15a29bfd26729f5ba44b/fixalign-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "11ddda648245d5418866a15292eeee67", "sha256": "455d86c31b839ecfcd1eb14f137f05e4af42db8b5b04ed946d64e284991eb9e0" }, "downloads": -1, "filename": "fixalign-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "11ddda648245d5418866a15292eeee67", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 15420, "upload_time": "2019-08-29T03:00:06", "url": "https://files.pythonhosted.org/packages/24/3d/12de03cb1656d0f8b29202a6384a3cc0af7591a2ff10421b2266ef448bb3/fixalign-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3b73482942f9ba6edbded4951af8d88f", "sha256": "3f9c135bbe25173369bb9becb69d04a983a120c0bc18c54b0c87392231110741" }, "downloads": -1, "filename": "fixalign-0.1.1.tar.gz", "has_sig": false, "md5_digest": "3b73482942f9ba6edbded4951af8d88f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 12958, "upload_time": "2019-08-29T03:00:08", "url": "https://files.pythonhosted.org/packages/37/78/9dec395626035d8f181dec10fcfcf77e87f29c501b0a1961754737f5a03c/fixalign-0.1.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "507deee7954f0835e3cda2f81ea95d17", "sha256": "57315f134228781593aa0c02bf5f79aba7e94ee95f83e503a6c5d07d7d9a69eb" }, "downloads": -1, "filename": "fixalign-0.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "507deee7954f0835e3cda2f81ea95d17", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 25034, "upload_time": "2019-10-08T00:42:53", "url": "https://files.pythonhosted.org/packages/b6/31/9192f64b0824dc2733b486aa59dd1dde7fd62082868e04db463ca7e6b8bc/fixalign-0.2.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a7b75f69537c684a4077193c951d1787", "sha256": "0d07a032b8e318b3ac820d7380d0236f720bbd1b4f207113e1d9aa389a3bbd44" }, "downloads": -1, "filename": "fixalign-0.2.2.tar.gz", "has_sig": false, "md5_digest": "a7b75f69537c684a4077193c951d1787", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 19248, "upload_time": "2019-10-08T00:42:56", "url": "https://files.pythonhosted.org/packages/ee/62/e81590f092ab444fd72e08af57f394a17a8581dfd3d78e822fd0711e16eb/fixalign-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "7a89b1bdbe492bac342bbe96ca41856d", "sha256": "26cdc1508ed0fc7a64b427d528bb38b0ad50508827bb1a199b5880396876839e" }, "downloads": -1, "filename": "fixalign-0.2.3-py3-none-any.whl", "has_sig": false, "md5_digest": "7a89b1bdbe492bac342bbe96ca41856d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 25074, "upload_time": "2019-10-08T01:28:25", "url": "https://files.pythonhosted.org/packages/b3/9a/3abc9a9670a18e7d7f41ab26bd33e41fc578b881e11147784ccc097b343d/fixalign-0.2.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0fd79049ec366af81f28f3550901bbb8", "sha256": "b13c94a7f838cbc441971f0a322d1e08ef35ab42a8823713d97f27fa2f0df3b1" }, "downloads": -1, "filename": "fixalign-0.2.3.tar.gz", "has_sig": false, "md5_digest": "0fd79049ec366af81f28f3550901bbb8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 19293, "upload_time": "2019-10-08T01:28:30", "url": "https://files.pythonhosted.org/packages/c0/27/5a6e21dd8e56f95e2606b0fdd6c0ef8e2ae8a6140a6c2450d3c61e8528af/fixalign-0.2.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7a89b1bdbe492bac342bbe96ca41856d", "sha256": "26cdc1508ed0fc7a64b427d528bb38b0ad50508827bb1a199b5880396876839e" }, "downloads": -1, "filename": "fixalign-0.2.3-py3-none-any.whl", "has_sig": false, "md5_digest": "7a89b1bdbe492bac342bbe96ca41856d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 25074, "upload_time": "2019-10-08T01:28:25", "url": "https://files.pythonhosted.org/packages/b3/9a/3abc9a9670a18e7d7f41ab26bd33e41fc578b881e11147784ccc097b343d/fixalign-0.2.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0fd79049ec366af81f28f3550901bbb8", "sha256": "b13c94a7f838cbc441971f0a322d1e08ef35ab42a8823713d97f27fa2f0df3b1" }, "downloads": -1, "filename": "fixalign-0.2.3.tar.gz", "has_sig": false, "md5_digest": "0fd79049ec366af81f28f3550901bbb8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 19293, "upload_time": "2019-10-08T01:28:30", "url": "https://files.pythonhosted.org/packages/c0/27/5a6e21dd8e56f95e2606b0fdd6c0ef8e2ae8a6140a6c2450d3c61e8528af/fixalign-0.2.3.tar.gz" } ] }