{ "info": { "author": "T Williams", "author_email": "trw.visions@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# **Aligner**\n\n### What\n**Aligner** is a tool for [FASTA] (https://en.wikipedia\n.org/wiki/FASTA_format) DNA sequence alignment.\n\n\n### How to use\n**Aligner** is fairly simple to use. All you need is a standard FASTA text\nfile (*.txt).\n\nBy the way, included in this package are two example files (fasta_example_1\n.txt,\nfasta_example_2.txt).\n\nIn order to install,\n\n```\npip install sequence_aligner\n```\n\nTo use aligner from the command line, you run the following commands:\n\n```\naligner [-h] --file_path FILE_PATH [--storage_path STORAGE_PATH] [--results_name\nRESULTS_NAME]\n```\n\nFILE PATH is the path to the fasta sequence text file.\n\nWhile FILE PATH is required, STORAGE_PATH and FILE_NAME are optional.\n\nBy default, the result file will be stored in the following generated directory:\n`/tmp/sequence_results`\n\nAnd the file name will be generated with the following pattern:\n`sequence_read-.txt` for example: `sequence_read-2016-07-01T16.42.21.246183.txt`\n\nOnce you run the command line, within seconds, your results will be\ngenerated, and you'll receive a print out of the location and name of your file.\n\nFor example:\n```Sequence Alignment results stored --> /tmp/sequence_results/sequence_read-2016-07-01T17.59.48.458859.txt```\n\n\n### How it works\nIn order to align sequences, first, I created a dictionary of all \nsubsequences from the size 'greater than half' to full length. \n\nOne sequence in the list of sequences was designated the \"anchor \nsequence\". \n\nI gathered that at any given state of the anchor sequence, there \nwould exist a sequence with the the greatest overlap. Iterating \nthrough the sequence list, the sequence with the \nsub-sequence (from the dictionary mentioned above) having the \ngreatest 'score' is identified. The score is based on the amount of \noverlap with the anchor sequence. This subsequence is then merged \ninto the anchor sequence. This iteration continues until all \nsequences in the sequence list have been merged into the anchor \nsequence. \n\n## Caveats/Issues\n* One issue I dealt with was speed. After many iterations of \nrefactoring, I was able to bring runtime down from ~45-50 min to ~10-11\n min. \n* This program assumes that sequences overlap and that there are no \nmutations.\n\n## Next Steps\n* Improve efficiency\n* Improve 'match-iness' algorithm -- right now, my 'score' is merely \nbased on length of overlap at a given state of the anchor sequence.\nMore preprocessing could be done before commencing alignment in order\n to score the level of 'matchiness' between all pairs of sequences.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Tiffany8/Sequence-Aligner", "keywords": null, "license": "UNKNOWN", "maintainer": null, "maintainer_email": null, "name": "SequenceAligner", "package_url": "https://pypi.org/project/SequenceAligner/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/SequenceAligner/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/Tiffany8/Sequence-Aligner" }, "release_url": "https://pypi.org/project/SequenceAligner/0.0.2/", "requires_dist": null, "requires_python": null, "summary": "FASTA sequence aligner - test", "version": "0.0.2" }, "last_serial": 2205482, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "bb57fe5bbd82b7193b875c721c504db4", "sha256": "bbd2cd48d95acb53824ec39fae4e2cba51f398266ba44ad35e318bff073ee310" }, "downloads": -1, "filename": "SequenceAligner-0.0.2.tar.gz", "has_sig": false, "md5_digest": "bb57fe5bbd82b7193b875c721c504db4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5836, "upload_time": "2016-07-06T11:29:44", "url": "https://files.pythonhosted.org/packages/7f/6b/ac0ae3cc75e024629f03e3916a6a336fea8c7dd016fac589d25f761a73e2/SequenceAligner-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "bb57fe5bbd82b7193b875c721c504db4", "sha256": "bbd2cd48d95acb53824ec39fae4e2cba51f398266ba44ad35e318bff073ee310" }, "downloads": -1, "filename": "SequenceAligner-0.0.2.tar.gz", "has_sig": false, "md5_digest": "bb57fe5bbd82b7193b875c721c504db4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5836, "upload_time": "2016-07-06T11:29:44", "url": "https://files.pythonhosted.org/packages/7f/6b/ac0ae3cc75e024629f03e3916a6a336fea8c7dd016fac589d25f761a73e2/SequenceAligner-0.0.2.tar.gz" } ] }