{ "info": { "author": "Cody Watson and Sandy Weng", "author_email": "watsonca@email.wofford.edu", "bugtrack_url": null, "classifiers": [], "description": "===========\r\nPyMethyl\r\n===========\r\n\r\nPyMethyl is a quick and dirty way to find methylation\r\npatterns within the human genome. It is useful for determining the methylation\r\npatterns of promoter regions and transcribable regions of different genes. \r\nThe package builds on cruzdb, pysam, and the local BLAST\r\nalignment tool from NCBI. These packages are used to gather information about\r\nthe genes you want to analyze. PyMethyl then sorts through the bam files of the \r\ngenome to look for methylation. All of the functions you need are accessible by typing\r\n\"from pymethyl.functions import *\". Some functions can be customized \r\nfor peak calling and mapping quality.\r\n\r\nAn example may look like this::\r\n\r\n #!/usr/bin/env python\r\n\r\n>>> from pymethyl.functions import *\r\n>>> info = regionInfo()\r\nWhat is the name of the gene you are interested in?: NUF2\r\nWhat is the reference genome you are searching? Example: hg18 : hg18\r\nDo you want to search the Promoter Region? (Y/N): Y\r\nHere is the promoter region data\r\n>>> file_handle = open(\"NUF2_Promoter.txt\", \"w\")\r\n>>> file_handle.write(info)\r\n>>> file_handle.close()\r\n>>> makeDatabase()\r\nChoose the text file to make into a database\r\nName of your Database: NUF2\r\nNCBI Database type, leave empty if unsure: nucl\r\nWhich database will you get the sequences from Example: hg18 : hg18\r\nYour database is processing, not to worry, this can take some time.\r\nDone\r\n>>> findMethylation()\r\nChose the data file which holds results from bam file. This should be a text file.\r\nType the gene name you are looking at. Example: 'ACACA' : NUF2\r\nWhat is the mapping quality needed for a read to be significant. 
Enter 0 for every read: 30\r\nType the amount of coverage you would like have a sequence considered methylated. Example: 5 (which means regions have 5 overlapping sequences are saved) : 5\r\nDone\r\nYour results are in your desktop folder named Methylated Results\r\n>>> blastIt()\r\nEnter the type of BLAST that you want to perform. Types of BLAST can be found on the NCBI website. Example: blastn : blastn\r\nName of the database you are searching. Would be the database you made with the MakeDatabase() function. Example: Sequences : NUF2\r\nWent through 1 data files\r\nCounted 1 events\r\n>>> getResults()\r\nType the methylated gene you are looking at: NUF2\r\nThere are 1 events of methylation within this region.\r\n\r\nfunctions\r\n=========\r\n\r\n\t* Functions is the only module that needs to be imported. This is because Functions is the GUI for all of the other modules.\r\n\t Realize that this process has to be done in order. This documentation will walk through what each function does and how\r\n\t to maneuver through the interface in order to get your methylated results. We will start with getting the genetic information.\r\n\r\n\t \r\nregionInfo()\r\n----------------------------------\r\n\"\"\"This function is based off of the cruzdb package. This is a more friendly user interface to gather information that the cruzdb\r\npackage already gets. This function looks at the transcribable region or the promoter region of the named gene. The idea is to put \r\nin your gene name, the database you would want to search that gene in, and the region that you are interested in looking at.\r\nFor example we are going to pick the gene NUF2 and database hg18. You can find different databases on the UCSC genomic browser website. \r\nThe databases are under assembly.\r\n\"\"\"\r\n\r\n>>> regionInfo()\r\nWhat is the name of the gene you are interested in?: NUF2\r\nWhat is the reference genome you are searching? Example: hg18 : hg18\r\nDo you want to search the Promoter Region? 
(Y/N): Y\r\nHere is the promoter region data\r\n'NUF2\\t1\\t161556346\\t161557346'\r\n\r\nThis is the output of the regionInfo function. It tells you the gene name, chromosome, starting position, and ending position\r\nof the region you selected. This information is necessary to make a database as well as to extract information from the bam file. Let's \r\nsave this information to a text file called NUF2_Promoter.txt on our desktop. It is important that the files are text files because that is\r\nthe only format this program is compatible with.\r\n\r\n\r\nmakeDatabase()\r\n----------------------------------\r\n\"\"\"This function is based on the cruzdb package. It is a more user-friendly interface for gathering information that can be transformed\r\ninto a working database for the local BLAST tool. The result of this function is a blast-able database which will be stored in the blast\r\nfolder created on your desktop. What you name your database is important, so make sure to remember it.\"\"\"\r\n\r\n>>> makeDatabase()\r\nChoose the text file to make into a database \r\n##At this point in the function, python will open a root window for you to choose a file. This file is going to be the NUF2_Promoter.txt file \r\n##that we created via the regionInfo() function. When you find this file in the GUI window, double click it.\r\nName of your Database: NUF2\r\n##Now choose the name for your database. I chose NUF2 but you can pick anything.\r\nNCBI Database type, leave empty if unsure: nucl\r\n##Now choose the type of database you intend to create. I chose nucl because that signifies a nucleotide database. The options for this\r\n##can be found on the NCBI website. If left empty, nucl is automatically chosen.\r\nWhich database will you get the sequences from Example: hg18 : hg18\r\n##Now choose the reference database from UCSC that you would like to search. 
Options for this can be found via the UCSC website.\r\nYour database is processing, not to worry, this can take some time.\r\nDone\r\n##Your database can be found in the blast folder you downloaded from NCBI and it is in the db subdirectory.\r\n\r\n\r\nextractBamFile()\r\n----------------------------------\r\n\"\"\"This function takes as input a file created by regionInfo. The format of the input file must be gene name \\t chromosome \\t start\r\nposition \\t end position \\n. This function also takes a sorted bam file as input, with its bai file in the same directory. The\r\nfunction also takes the sample name, which will be the name of the directory created, and the read length. For methylation data this can\r\nbe reads of 50bp - 250bp. With that user input, the program will create a directory with a file named for the gene that it represents.\r\nThe files inside of the directory contain all of the reads, formatted as start position \\t end position \\t mapping quality \\t sequence.\r\nThis file is the input file for the findMethylation function.\"\"\"\r\n\r\n>>> extractBamFile()\r\nChoose the data file which holds the information of the gene that you are looking for. Format gene name, chromosome, start position, end position.\r\n##Here you would choose the input file that is formatted as described above.\r\nChoose the BAM file that you are extracting reads from.\r\n##Choose the BAM file but make sure that the bai file is in the same directory as the bam file.\r\nWhat is the name of the sample the reads are coming from: Sample1\r\n##This can be named anything. It is just what the directory that shows up on your desktop will be called.\r\nWhat is the length of each read's sequence: 50\r\n##The reads that are contained within the bam file have a length and they are usually 50-250bp long. 
This number is necessary because the \r\n##BAM file does not contain information about the end position; instead it is found using the read length.\r\n\r\n\r\n\r\nfindMethylation()\r\n----------------------------------\r\n\"\"\"This function is original to this package and doesn't depend on any other packages. It is a quick and dirty algorithm that finds\r\nthe places of hypermethylation. It can be customized depending on what you decide is significant. It works by taking all of the sequences\r\nin the information file that you give it and finding areas where X (a number you choose) sequences overlap. It then\r\nprepares these regions and sequences in a file that the blastIt function can blast for statistical significance.\"\"\"\r\n\r\n>>> findMethylation()\r\nChose the data file which holds results from bam file. This should be a text file.\r\n##A GUI window should appear allowing you to choose a file. This file should be the text file produced by extractBamFile(), the function that\r\n##extracts information out of the bam file.\r\nType the gene name you are looking at. Example: 'ACACA' : NUF2\r\n##Self-explanatory. Just enter the gene you are looking at.\r\nWhat is the mapping quality needed for a read to be significant. Enter 0 for every read: 30\r\n##Each read in a bam file has a mapping quality. You can set this threshold higher or lower. The higher the mapping quality, the better\r\n##the quality of data you will have. However, the higher the mapping quality, the more sequences are rejected and the amount of data decreases.\r\nType the amount of coverage you would like have a sequence considered methylated. Example: 5 (which means regions have 5 overlapping \r\nsequences are saved) : 5\r\n##This is where the user determines the amount of overlap or peak coverage that they would like in order to warrant a significant \r\n##methylation event. 
The program will then only find regions where that amount of coverage was found.\r\nDone\r\nYour results are in your desktop folder named Methylated Results\r\n##If you see the message below, it is because your mapping quality stipulation was too high, your amount of coverage was too high,\r\n##or there are not enough sequences for a significant methylation event to have been counted. \r\nNot Enough Reads To Warrant Significant Methylation\r\n##If there simply is no methylation within your samples then you will get the message\r\nNo Methylated Detected, File Removed\r\n##If your input file contains nothing, you will get the message\r\nFile contained no sequences\r\n\r\n\r\nblastIt()\r\n----------------------------------\r\n\"\"\"This function takes the output files from the findMethylation function and blasts each sequence against the locally made database to\r\nensure that the formulated sequence actually belongs to a gene. The database is the one built by the makeDatabase function. The output of this\r\nfunction is one file for every methylated sequence. Each file contains an alignment of the sequence to the reference gene, as well\r\nas the E-value signifying how likely it is that the sequence belongs to that gene.\"\"\"\r\n\r\n>>> blastIt()\r\nEnter the type of BLAST that you want to perform. Types of BLAST can be found on the NCBI website. Example: blastn : blastn\r\n##This feature allows the user to do a blast for nucleotides; other options include blastp. Options can be found on the NCBI website.\r\nName of the database you are searching. Would be the database you made with the makeDatabase() function. 
Example: Sequences : NUF2\r\n##Remember this was the name of the database that we created using the makeDatabase function.\r\nWent through 1 data files\r\n##Tells you how many files were looked at\r\nCounted 1 events\r\n##Tells you how many events of methylation occurred in all those files.\r\n\r\n\r\n*****Some users may find it necessary to completely change the command for blasting. To do this, you must go into the source code and change it.*****\r\n\r\n\r\ngetResults()\r\n----------------------------------\r\n\"\"\"This function counts the number of methylated events occurring in the region. It is nothing more than a counter function. It does not\r\nactually do anything to the data.\"\"\"\r\n>>> getResults()\r\n##It is important that you spell the gene name correctly, exactly the way it appears.\r\nType the methylated gene you are looking at: NUF2\r\nThere are 1 events of methylation within this region.\r\n\r\n\r\n\r\nTests\r\n=========\r\n\"\"\"The tests are located in the tests folder. They require interaction from the user.\"\"\"\r\n\r\nTHIS IS A RUN OF THE TEST\r\n\r\nWhen the program asks you to choose a file, pick the NUF.txt file that will appear on the screen.\r\n\r\n>>> \r\ntest_1 (__main__.TestSequenceFunctions) ... What is the name of the gene you are interested in?: NUF2\r\nWhat is the reference genome you are searching? Example: hg18 : hg18\r\nDo you want to search the Promoter Region? (Y/N): Y\r\nHere is the promoter region data\r\nok\r\ntest_2 (__main__.TestSequenceFunctions) ... Choose the text file to make into a database\r\nName of your Database: NUF2\r\nNCBI Database type, leave empty if unsure: nucl\r\nWhich database will you get the sequences from Example: hg18 : hg18\r\nYour database is processing, not to worry, this can take some time.\r\nDone\r\nok\r\ntest_3 (__main__.TestSequenceFunctions) ... Choose the data file which holds the information of the gene \\\r\nthat you are looking for. 
Format gene name, chromosome, start position, end position.\r\nChoose the BAM file that you are extracting reads from.\r\nWhat is the name of the sample the reads are coming from: Sample1\r\nWhat is the length of each read's sequence: 50\r\nDone\r\ntest_4 (__main__.TestSequenceFunctions) ... Chose the data file which holds results from bam file. This should be a text file.\r\nType the gene name you are looking at. Example: 'ACACA' : NUF2\r\nWhat is the mapping quality needed for a read to be significant. Enter 0 for every read: 0\r\nType the amount of coverage you would like have a sequence considered methylated. Example: 5 (which means regions have 5 overlapping sequences are saved) : 5\r\nDone\r\nYour results are in your desktop folder named Methylated Results\r\nok\r\ntest_5 (__main__.TestSequenceFunctions) ... Enter the type of BLAST that you want to perform. Types of BLAST can be found on the NCBI website. Example: blastn : blastn\r\nName of the database you are searching. Would be the database you made with the MakeDatabase() function. Example: Sequences : NUF2\r\nWent through 1 data files\r\nCounted 1 events\r\nok\r\ntest_6 (__main__.TestSequenceFunctions) ... Type the methylated gene you are looking at: NUF2\r\nok\r\ntest_7 (__main__.TestSequenceFunctions) ... 
ok\r\n\r\n----------------------------------------------------------------------\r\nRan 7 tests in 52.829s\r\n\r\nOK", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://pypi.python.org/pypi/PyMethyl", "keywords": "Methylation patterns", "license": "LICENSE.txt", "maintainer": null, "maintainer_email": null, "name": "PyMethyl", "package_url": "https://pypi.org/project/PyMethyl/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/PyMethyl/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://pypi.python.org/pypi/PyMethyl" }, "release_url": "https://pypi.org/project/PyMethyl/0.1.3/", "requires_dist": null, "requires_python": null, "summary": "Tool to determine methylation patterns.", "version": "0.1.3" }, "last_serial": 1184120, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "2800ebb8181c2159f5746583d0ff1a14", "sha256": "49dc643bc2701fd36d348c54b2f6b633877e0dd6111bd22e2ada191587da9a4f" }, "downloads": -1, "filename": "PyMethyl-0.1.0.zip", "has_sig": false, "md5_digest": "2800ebb8181c2159f5746583d0ff1a14", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38196, "upload_time": "2014-08-07T23:18:48", "url": "https://files.pythonhosted.org/packages/a3/7e/5b7aeef8213f60ab35b3d64f0777f3a8b722b66d0bda325b09769f409f96/PyMethyl-0.1.0.zip" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "bc2dd793ca8dacb17664693938b8f5e2", "sha256": "598fb40246b1cd5227ee2bfea3570883fb0e709f3ac2b4ea48af02911e3beeae" }, "downloads": -1, "filename": "PyMethyl-0.1.1.zip", "has_sig": false, "md5_digest": "bc2dd793ca8dacb17664693938b8f5e2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38191, "upload_time": "2014-08-07T23:21:33", "url": 
"https://files.pythonhosted.org/packages/9f/27/8847eb5fd5435d7616b9b308b0d620208d32d293d089e9a161beaebb4168/PyMethyl-0.1.1.zip" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "20e227f07f4c9d4d839c82f1255668ce", "sha256": "446778982c9055bcdde3080665734313264b31a620eb075f8a063398c4ad086e" }, "downloads": -1, "filename": "PyMethyl-0.1.2.tar.gz", "has_sig": false, "md5_digest": "20e227f07f4c9d4d839c82f1255668ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30981, "upload_time": "2014-08-08T14:58:37", "url": "https://files.pythonhosted.org/packages/b2/de/b5902296d164e7efebf54b98a27a97025aeb714012b9b7b0cfb40d25f7fa/PyMethyl-0.1.2.tar.gz" }, { "comment_text": "", "digests": { "md5": "1e4f152720c83ad2c9580d9e184f05d7", "sha256": "6f27e27dd08073f89852983a6d696059cb1f2f56e0ad62a692ceef407b301651" }, "downloads": -1, "filename": "PyMethyl-0.1.2.zip", "has_sig": false, "md5_digest": "1e4f152720c83ad2c9580d9e184f05d7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38191, "upload_time": "2014-08-07T23:23:49", "url": "https://files.pythonhosted.org/packages/94/dd/464e74fe0aba44f809e4819b60537882a76788e7582f44770436287be625/PyMethyl-0.1.2.zip" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "4ee7a1798582852afc272c6b192eefbf", "sha256": "ce228a4f062455e8086162957b23615a5686a03e18552222669a927b570889b8" }, "downloads": -1, "filename": "PyMethyl-0.1.3.tar.gz", "has_sig": false, "md5_digest": "4ee7a1798582852afc272c6b192eefbf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30960, "upload_time": "2014-08-08T15:01:52", "url": "https://files.pythonhosted.org/packages/3d/12/68ebac1a175166fc3ca70bccf054cfae9b0799b9459cc75d59775483af6d/PyMethyl-0.1.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4ee7a1798582852afc272c6b192eefbf", "sha256": "ce228a4f062455e8086162957b23615a5686a03e18552222669a927b570889b8" }, "downloads": -1, 
"filename": "PyMethyl-0.1.3.tar.gz", "has_sig": false, "md5_digest": "4ee7a1798582852afc272c6b192eefbf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30960, "upload_time": "2014-08-08T15:01:52", "url": "https://files.pythonhosted.org/packages/3d/12/68ebac1a175166fc3ca70bccf054cfae9b0799b9459cc75d59775483af6d/PyMethyl-0.1.3.tar.gz" } ] }