{ "info": { "author": "Vit Novacek, DERI, NUIG", "author_email": "vit.novacek@deri.org", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Environment :: Web Environment", "Intended Audience :: Education", "Intended Audience :: End Users/Desktop", "Intended Audience :: Healthcare Industry", "Intended Audience :: Information Technology", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", "Operating System :: OS Independent", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Scientific/Engineering :: Bio-Informatics", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Text Processing :: Indexing", "Topic :: Text Processing :: Linguistic" ], "description": "==============================\r\n SKIMMR_BM - a Brief Overview\r\n ==============================\r\n \r\n -----------------------------------------------------\r\n Contact vit.novacek@deri.org for more detailed info\r\n -----------------------------------------------------\r\n \r\n 0. ABSTRACT AND TABLE OF CONTENTS\r\n =================================\r\n \r\n This document provides basic information about SKIMMR (a tool for \r\n machine-aided skim reading), in particular about its SKIMMR_BM package version\r\n that focuses on biomedical texts. \r\n \r\n The document contains three sections:\r\n \r\n 1. ABOUT - overview of the tool and its functionalities\r\n \r\n 2. INSTALLATION - basic instructions on how to install SKIMMR\r\n \r\n 3. USING SKIMMR - basic instruction on how to use it after installation\r\n \r\n 1. ABOUT\r\n ========\r\n \r\n SKIMMR is a research prototype aimed at helping users to navigate through \r\n large amounts of textual data efficiently. This is done by extending the \r\n traditional paradigm of searching and browsing of text collections. SKIMMR \r\n lets the user skim texts by navigating a network of concepts and relations \r\n explicitly or implicitly present in them. The concepts and their relations \r\n have been extracted and inferred from the textual content using novel machine \r\n reading techniques that power the SKIMMR back-end.\r\n \r\n The interconnected `skimming networks' provide a high-level overview of the \r\n domain covered by the texts, and let the user quickly discover interesting \r\n pieces of information. This process also largely reduces the burden of \r\n sieving through lots of irrelevant resources, which is often the down side \r\n of using the standard search engines. When the users identify interesting \r\n information within the high level overview, they can continue reading the \r\n related textual resources in detail. \r\n \r\n This particular version of SKIMMR (SKIMMR_BM) focuses on biomedical articles \r\n available on PubMed. The knowledge base exposed by the SKIMMR interface is \r\n extracted from the PubMed abstracts (see the SKIMMR back-end documentation for \r\n details). Once the users finish skimming, i.e., navigating the network of \r\n concepts and relations extracted from the PubMed articles, they can easily \r\n browse and read the related publications. This is done using an embedded \r\n window with PubMed results focused on the articles most relevant to the \r\n concepts discovered when skimming the data. Specific examples that illustrate \r\n the way SKIMMR works are provided in the following section. \r\n \r\n 2. INSTALLATION\r\n ===============\r\n \r\n Probably the easiest way of installing SKIMMR_BM is using easy_install:\r\n \r\n \t*easy_install skimmr_bm*\r\n \r\n Check the documentation at http://peak.telecommunity.com/DevCenter/EasyInstall\r\n for more detailed info on easy_install and setuptools.\r\n \r\n Should you prefer to download and install the package manually, fetch the\r\n SKIMMR_BM distribution archive file first. After unpacking it, switch to the \r\n generated directory and execute the following command:\r\n \r\n \t*python setup.py install*\r\n \r\n If you want to install the package locally (for the current user only), use \r\n the following:\r\n \r\n \t*python setup.py install --user*\r\n \r\n Check the documentation at http://docs.python.org/2/distutils/index.html for \r\n more detailed options.\r\n \r\n 3. USING SKIMMR\r\n ===============\r\n \r\n After downloading and installing the SKIMMR package, it can be readily used\r\n in a basic manner through the scripts provided. These are:\r\n \r\n - dwnl_bm.py - download of the PubMed abstracts to be processed\r\n \r\n - exst_bm.py - extraction of co-occurrence statements from texts\r\n \r\n - crkb_bm.py - creation of a knowledge base and its population by semantic \r\n similarity relations\r\n \r\n - ixkb_bm.py - indexing of the knowledge base for efficient querying\r\n \r\n - prep_bm.py - preparation of the sub-folder structure and resources in the \r\n working directory necessary in order to launch the SKIMMR server\r\n \r\n - srvr_bm.py - launching the SKIMMR server and UI\r\n \r\n The scripts are located in the *bin* subdirectory of the installation package.\r\n Alternatively, you can copy them from wherever your system puts Python package\r\n binaries (check the documentation of your operating system and your local \r\n Python implementation).\r\n \r\n The typical way of using these scripts is summarised in the following sections.\r\n Note that there are other ways how to launch the scripts - you can check the \r\n documentation in the script source code for details.\r\n \r\n 3.1 Creating the working directory\r\n ----------------------------------\r\n \r\n First of all, SKIMMR requires a place to store and process its data. Create a\r\n directory for that somewhere (let us assume it is called *skimmr* in the \r\n following). Switch to that directory then and copy all the SKIMMR_BM scripts \r\n there. After that, run\r\n \r\n \t*python prep_bm.py*\r\n \r\n in there. This will generate three sub-directories, *data*, *lingpipe* and \r\n *text*, as well as a couple of files and directories deeper in the *data* one. \r\n By default, the LingPipe text mining software is downloaded by this script,\r\n as it is a preferred method for the biomedical knowledge extraction later\r\n on. You can skip this, however, you need to copy the LingPipe software into\r\n the *lingpipe* directory yourselves then, or use a general knowledge extraction\r\n pipe-line built into the SKIMMR system (see the detailed options documented\r\n in the script sources). \r\n \r\n 3.2 Downloading abstracts from PubMed\r\n -------------------------------------\r\n \r\n SKIMMR_BM is specifically meant to process textual content available via the\r\n PubMed repository. After placing one or more files with the *.pmid* extension\r\n into the *text* folder in the *skimmr* directory, the tool can automatically \r\n download all the abstracts corresponding to the PubMed identifiers provided\r\n in the *.pmid* files (divided by any sequence of white space characters or\r\n commas). This is done by running the following\r\n \r\n \t*python dwnl_bm.py E-MAIL*\r\n \r\n where E-MAIL is your e-mail (used for the purposes of the Entrez API used by \r\n SKIMMR to fetch the article data from PubMed).\r\n \r\n 3.3 Processing the texts\r\n ------------------------\r\n \r\n If you have not downloaded PubMed abstracts based on their IDs in the previous\r\n step, copy the locally stored abstracts files you want to process to the \r\n *text* folder in the *skimmr* directory. Plain text files (in ASCII or Unicode \r\n format) are supported, with the *.txt* extensions. Note that the filenames \r\n should be the PubMed IDs of the corresponding articles (plus the *.txt* \r\n extension), otherwise the source article look-up in SKIMMR later on will not\r\n produce any meaningful results.\r\n \r\n After you have all the texts in place, run\r\n \r\n \t*python exst_bm.py*\r\n \r\n in the *skimmr* directory. This will chop up the texts into paragraphs and\r\n extract the co-occurrence statements from them. There is a limit imposed\r\n on the number of produced statements in the exst.py script, dynamically\r\n computed from the available memory (or set to 750,000 if the psutil package\r\n is not available on your system). You can change that when using the SKIMMR\r\n library functions directly. \r\n \r\n 3.4 Creating the knowledge base\r\n -------------------------------\r\n \r\n After generating the co-occurrence statements in the previous step, you can\r\n create the knowledge base from them using \r\n \r\n \t*python crkb_bm.py create*\r\n \r\n which will generate a couple of knowledge base persistence files in the *stre*\r\n sub-folder of the *data* directory in the *skimmr* root folder.\r\n \r\n 3.5 Computing similarities\r\n --------------------------\r\n \r\n When the knowledge base has been generated, you can augment it by computing\r\n semantic similarity relationships between the terms that are more frequent \r\n than average:\r\n \r\n \t*python crkb_bm.py compsim*\r\n \r\n This will update the knowledge base persistence files accordingly. Note that \r\n this step may take up to several hours for larger knowledge bases!\r\n \r\n 3.6 Indexing the knowledge base\r\n -------------------------------\r\n \r\n Before you can expose the processed content via a SKIMMR web interface, you\r\n have to index the knowledge base. This is done by running\r\n \r\n \t*python ixkb_bm.py*\r\n \r\n which will generate a couple of index files in the knowledge base persistence\r\n sub-directory.\r\n \r\n 3.7 Launching and using the server\r\n ----------------------------------\r\n \r\n At last, you can launch the SKIMMR server by\r\n \r\n \t*python srvr_bm.py*\r\n \r\n This will start the server at localhost (127.0.0.1) and port 8008. You can \r\n specify alternative addresses and ports by running the server as\r\n \r\n \t*python srvr_bm.py ADDRESS:PORT*\r\n \r\n Also, you can specify an alternative store to be loaded by the server (useful\r\n if you want to examine multiple stores you have previously generated):\r\n \r\n \t*python srvr_bm.py [ADDRESS:PORT] [FOLDER]*\r\n \r\n where FOLDER is a path to the store you want to load.\r\n \r\n After the SKIMMR server has been started, you can point you browser to the \r\n corresponding address and port and start using the tool as indicated in the\r\n *About* web-page accessible from the SKIMMR interface (just follow the link \r\n in the bottom of every page in the SKIMMR web interface).", "description_content_type": null, "docs_url": null, "download_url": "https://dl.dropbox.com/u/21379226/software/skimmr_bm-0.1-a2.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://www.skimmr.org", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "skimmr_bm", "package_url": "https://pypi.org/project/skimmr_bm/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/skimmr_bm/", "project_urls": { "Download": "https://dl.dropbox.com/u/21379226/software/skimmr_bm-0.1-a2.tar.gz", "Homepage": "http://www.skimmr.org" }, "release_url": "https://pypi.org/project/skimmr_bm/0.1-a2/", "requires_dist": null, "requires_python": null, "summary": "SKIMMR library and scipts for machine-aided skim-reading (the specific package version for biomedical texts)", "version": "0.1-a2" }, "last_serial": 803341, "releases": { "0.1-a1": [ { "comment_text": "", "digests": { "md5": "da8052834ae7e6391a659e12b8c83dbb", "sha256": "8ca4ac67100c3c36230205fb86dee33801cf2491218daeac2eae40e983e4e548" }, "downloads": -1, "filename": "skimmr_bm-0.1-a1.tar.gz", "has_sig": false, "md5_digest": "da8052834ae7e6391a659e12b8c83dbb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 812782, "upload_time": "2013-02-15T17:29:15", "url": "https://files.pythonhosted.org/packages/e9/4a/5eac799cb67b4cbd3e6d03abad24f5d367d3345dc5ea865c2ba893458fc1/skimmr_bm-0.1-a1.tar.gz" } ], "0.1-a2": [ { "comment_text": "", "digests": { "md5": "f68b0953ab851cc455d66132583bd304", "sha256": "840915e2703abec5048e179d49a9d6c72656428936d2bf28753ba39d0435965b" }, "downloads": -1, "filename": "skimmr_bm-0.1-a2.tar.gz", "has_sig": false, "md5_digest": "f68b0953ab851cc455d66132583bd304", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 812784, "upload_time": "2013-02-15T17:30:08", "url": "https://files.pythonhosted.org/packages/a7/92/1419e602dd0c5ad2711baa11370f30abd1c4758ab2471fc1cb175b7f3635/skimmr_bm-0.1-a2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f68b0953ab851cc455d66132583bd304", "sha256": "840915e2703abec5048e179d49a9d6c72656428936d2bf28753ba39d0435965b" }, "downloads": -1, "filename": "skimmr_bm-0.1-a2.tar.gz", "has_sig": false, "md5_digest": "f68b0953ab851cc455d66132583bd304", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 812784, "upload_time": "2013-02-15T17:30:08", "url": "https://files.pythonhosted.org/packages/a7/92/1419e602dd0c5ad2711baa11370f30abd1c4758ab2471fc1cb175b7f3635/skimmr_bm-0.1-a2.tar.gz" } ] }