{
"info": {
"author": "Luca Soldaini",
"author_email": "luca@ir.cs.georgetown.edu",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 5 - Production/Stable",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
"Programming Language :: Python :: 2",
"Programming Language :: Python :: 3",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
"Topic :: Scientific/Engineering :: Bio-Informatics"
],
"description": "[**NEW: v.1.3 is pip-ready!**](https://giphy.com/embed/BlVnrxJgTGsUw) You can now install QuickUMLS through a simple `pip install quickumls`.\n\n# QuickUMLS\n\nQuickUMLS (Soldaini and Goharian, 2016) is a tool for fast, unsupervised biomedical concept extraction from medical text.\nIt takes advantage of [Simstring](http://www.chokkan.org/software/simstring/) (Okazaki and Tsujii, 2010) for approximate string matching.\nFor more details on how QuickUMLS works, please refer to our paper.\n\nThis project should be compatible with Python 3 (Python 2 is [no longer supported](https://pythonclock.org/)) and run on any UNIX system (support for Windows is experimental, please report bugs!). **If you find any bugs, please file an issue on GitHub or email the author at luca@ir.cs.georgetown.edu**.\n\n## Installation\n\n1. **Obtain a UMLS installation**: This tool requires you to have a valid UMLS installation on disk. To install UMLS, you must first obtain a [license](https://uts.nlm.nih.gov/license.html) from the National Library of Medicine; then you should download all UMLS files from [this page](https://www.nlm.nih.gov/research/umls/licensedcontent/umlsknowledgesources.html); finally, you can install UMLS using the [MetamorphoSys](https://www.nlm.nih.gov/pubs/factsheets/umlsmetamorph.html) tool as [explained in this guide](https://www.nlm.nih.gov/research/umls/implementation_resources/metamorphosys/help.html). The UMLS installation can be removed once QuickUMLS has been initialized.\n2. **Install QuickUMLS**: You can do so by running either `pip install quickumls` or `python setup.py install`. On macOS, using Anaconda is **strongly recommended**\u2020. \n3. **Obtain a spaCy corpus**: After you install QuickUMLS and its dependencies, you should be able to do so by running `python -m spacy download en`.\n4. 
**Create a QuickUMLS installation**: Initialize the system by running `python -m quickumls.install <umls_installation_path> <destination_path>`, where `<umls_installation_path>` is the directory containing the UMLS installation files (in particular, `MRCONSO.RRF` and `MRSTY.RRF` are required) and `<destination_path>` is the directory where the QuickUMLS data files should be installed. This process takes between 5 and 30 minutes, depending on the speed of the CPU and of the drive where the UMLS and QuickUMLS files are stored (on a system with an Intel i7 6700K CPU and a 7200 RPM hard drive, initialization takes 8.5 minutes). `python -m quickumls.install` supports the following optional arguments:\n - `-L` / `--lowercase`: if used, all concept terms are folded to lowercase before being processed. This option typically increases recall, but it might reduce precision;\n - `-U` / `--normalize-unicode`: if used, expressions with non-ASCII characters are converted to the closest combination of ASCII characters;\n - `-E` / `--language`: specifies the language to consider for UMLS concepts; by default, English is used. 
For a complete list of languages, please see [this table provided by NLM](https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/release/abbreviations.html#LAT).\n\n\n**\u2020**: If the installation fails on macOS when using Anaconda, install `leveldb` first by running `conda install -c conda-forge python-leveldb`.\n\n## APIs\n\nA QuickUMLS object can be instantiated as follows:\n\n```python\nfrom quickumls import QuickUMLS\n\nmatcher = QuickUMLS(quickumls_fp, overlapping_criteria, threshold,\n similarity_name, window, accepted_semtypes)\n```\n\nWhere:\n\n- `quickumls_fp` is the directory where the QuickUMLS data files are installed.\n- `overlapping_criteria` (optional, default: \"score\") is the criterion used to deal with overlapping concepts; choose \"score\" if the concepts with the highest matching score should be considered first, \"length\" if the longest should be considered first instead.\n- `threshold` (optional, default: 0.7) is the minimum similarity value between strings.\n- `similarity_name` (optional, default: \"jaccard\") is the name of the similarity measure to use. Choose between \"dice\", \"jaccard\", \"cosine\", or \"overlap\".\n- `window` (optional, default: 5) is the maximum number of tokens to consider for matching.\n- `accepted_semtypes` (optional, default: see `constants.py`) is the set of UMLS semantic types concepts should belong to. Semantic types are identified by the letter \"T\" followed by three digits (e.g., \"T131\", which identifies the type *\"Hazardous or Poisonous Substance\"*). 
See [here](https://metamap.nlm.nih.gov/Docs/SemanticTypes_2013AA.txt) for the full list.\n\nTo use the matcher, simply call:\n\n```python\ntext = \"The ulna has dislocated posteriorly from the trochlea of the humerus.\"\nmatcher.match(text, best_match=True, ignore_syntax=False)\n```\n\nSet `best_match` to `False` if you want to return overlapping candidates, and set `ignore_syntax` to `True` to disable all heuristics introduced in (Soldaini and Goharian, 2016).\n\n\n## Server / Client Support\n\nStarting with v.1.2, QuickUMLS can be used in a client-server configuration. That is, you can start one QuickUMLS server and query it from multiple scripts using a client.\n\nTo start the server, run `python -m quickumls.server`:\n\n```bash\npython -m quickumls.server /path/to/quickumls/files {-P QuickUMLS port} {-H QuickUMLS host} {QuickUMLS options}\n```\n\nHost and port are optional; by default, QuickUMLS runs on `localhost:4645`. You can also pass any QuickUMLS option mentioned above to the server. To obtain a list of options for the server, run `python -m quickumls.server -h`.\n\nTo load the client, import `get_quickumls_client` from `quickumls`:\n\n```python\nfrom quickumls import get_quickumls_client\nmatcher = get_quickumls_client()\ntext = \"The ulna has dislocated posteriorly from the trochlea of the humerus.\"\nmatcher.match(text, best_match=True, ignore_syntax=False)\n```\n\nThe API of the client is the same as that of a QuickUMLS object.\n\n\nIf you wish to run the server in the background, you can do so as follows:\n\n```bash\nnohup python -m quickumls.server /path/to/QuickUMLS {server options} > /dev/null 2>&1 & echo $! > nohup.pid\n```\n\nWhen you are done, don't forget to stop the server by running:\n```bash\nkill -9 `cat nohup.pid`\nrm nohup.pid\n```\n\n## References\n\n- Naoaki Okazaki and Jun'ichi Tsujii. \"*Simple and efficient algorithm for approximate dictionary matching.*\" COLING 2010.\n- Luca Soldaini and Nazli Goharian. 
\"*QuickUMLS: a fast, unsupervised approach for medical concept extraction.*\" MedIR Workshop, SIGIR 2016.",
"description_content_type": "text/markdown",
"docs_url": null,
"download_url": "",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/Georgetown-IR-Lab/QuickUMLS",
"keywords": "",
"license": "MIT",
"maintainer": "",
"maintainer_email": "",
"name": "quickumls",
"package_url": "https://pypi.org/project/quickumls/",
"platform": "",
"project_url": "https://pypi.org/project/quickumls/",
"project_urls": {
"Homepage": "https://github.com/Georgetown-IR-Lab/QuickUMLS"
},
"release_url": "https://pypi.org/project/quickumls/1.3.0.post4/",
"requires_dist": null,
"requires_python": "",
"summary": "QuickUMLS is a tool for fast, unsupervised biomedical concept extraction from medical text",
"version": "1.3.0.post4"
},
"last_serial": 5418163,
"releases": {
"1.3.0.post4": [
{
"comment_text": "",
"digests": {
"md5": "10e86e72a32d9f62230f2fa162cccc6e",
"sha256": "c5c6547e382e015921c44f31e2cf66703c1a972587f49d826415f719da5f3e91"
},
"downloads": -1,
"filename": "quickumls-1.3.0.post4.tar.gz",
"has_sig": false,
"md5_digest": "10e86e72a32d9f62230f2fa162cccc6e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20993,
"upload_time": "2019-06-19T02:31:57",
"url": "https://files.pythonhosted.org/packages/ee/bf/ff99e77ee8f0ae878cefdc68e77228124362e73675c0b4c8f082f61f396d/quickumls-1.3.0.post4.tar.gz"
}
],
"1.3.post1": [
{
"comment_text": "",
"digests": {
"md5": "83509ac04f6e13b6d4e8f96da34f40c9",
"sha256": "92b350d6a4c525be72db4cc49d7966c04d860b55147f5d77cb35c7195a500e86"
},
"downloads": -1,
"filename": "quickumls-1.3.post1.tar.gz",
"has_sig": false,
"md5_digest": "83509ac04f6e13b6d4e8f96da34f40c9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20879,
"upload_time": "2019-06-18T04:07:32",
"url": "https://files.pythonhosted.org/packages/aa/b9/864222750de6287f9c12985f6db11b7908005c74472b4012cc99f36b48b9/quickumls-1.3.post1.tar.gz"
}
],
"1.3.post2": [
{
"comment_text": "",
"digests": {
"md5": "4d835352bcc38688729b2a399c9b9f54",
"sha256": "7dfed353f7eeca8695943374d3787233d969f695905964db5d3fc65d901a9aa9"
},
"downloads": -1,
"filename": "quickumls-1.3.post2.tar.gz",
"has_sig": false,
"md5_digest": "4d835352bcc38688729b2a399c9b9f54",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20879,
"upload_time": "2019-06-18T04:24:37",
"url": "https://files.pythonhosted.org/packages/a1/5c/75dc51f405b0329428147330cdafa0b779e71748055a8724868471dbf134/quickumls-1.3.post2.tar.gz"
}
],
"1.3.post3": [
{
"comment_text": "",
"digests": {
"md5": "8fb581d33733f1737caa1a19ac409cbe",
"sha256": "94a95ed2aa41f0ad915ea4770477337c1939f52b4b842cf844696b05863a1912"
},
"downloads": -1,
"filename": "quickumls-1.3.post3.tar.gz",
"has_sig": false,
"md5_digest": "8fb581d33733f1737caa1a19ac409cbe",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20902,
"upload_time": "2019-06-18T04:30:22",
"url": "https://files.pythonhosted.org/packages/17/cf/fa547fd92dbb99d2577b360898e68ff095091e88aac2a9662e436f855f08/quickumls-1.3.post3.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "10e86e72a32d9f62230f2fa162cccc6e",
"sha256": "c5c6547e382e015921c44f31e2cf66703c1a972587f49d826415f719da5f3e91"
},
"downloads": -1,
"filename": "quickumls-1.3.0.post4.tar.gz",
"has_sig": false,
"md5_digest": "10e86e72a32d9f62230f2fa162cccc6e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20993,
"upload_time": "2019-06-19T02:31:57",
"url": "https://files.pythonhosted.org/packages/ee/bf/ff99e77ee8f0ae878cefdc68e77228124362e73675c0b4c8f082f61f396d/quickumls-1.3.0.post4.tar.gz"
}
]
}