{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "# NeuroNER\n\n[![Build Status](https://travis-ci.org/Franck-Dernoncourt/NeuroNER.svg?branch=master)](https://travis-ci.org/Franck-Dernoncourt/NeuroNER)\n\nNeuroNER is a program that performs named-entity recognition (NER). Website: [neuroner.com](http://neuroner.com).\n\nThis page gives step-by-step instructions to install and use NeuroNER.\n\n\n## Table of Contents\n\n\n\n- [Requirements](#requirements)\n- [Installation](#installation)\n- [Using NeuroNER](#using-neuroner)\n * [Adding a new dataset](#adding-a-new-dataset)\n * [Using a pretrained model](#using-a-pretrained-model)\n * [Sharing a pretrained model](#sharing-a-pretrained-model)\n * [Using TensorBoard](#using-tensorboard)\n- [Citation](#citation)\n\n\n\n## Requirements\n\nNeuroNER relies on Python 3, TensorFlow 1.0+, and optionally on BRAT:\n\n- Python 3: NeuroNER does not work with Python 2.x. On Windows, it has to be Python 3.6 64-bit or later.\n- TensorFlow is a library for machine learning. NeuroNER uses it for its NER engine, which is based on neural networks. Official website: [https://www.tensorflow.org](https://www.tensorflow.org)\n- BRAT (optional) is a web-based annotation tool. It only needs to be installed if you wish to conveniently create annotations or view the predictions made by NeuroNER. Official website: [http://brat.nlplab.org](http://brat.nlplab.org)\n\n## Installation\n\nFor GPU support, [GPU requirements for Tensorflow](https://www.tensorflow.org/install/) must be satisfied. If your system does not meet these requirements, you should use the CPU version. To install neuroner:\n\n```\n# For CPU support (no GPU support):\npip3 install pyneuroner[cpu]\n\n# For GPU support:\npip3 install pyneuroner[gpu]\n```\n\nYou will also need to download some support packages.\n\n1. The English language module for Spacy:\n\n```\n# Download the SpaCy English module\npython -m spacy download en\n```\n\n2. Download word embeddings from http://neuroner.com/data/word_vectors/glove.6B.100d.zip, unzip them to the folder `./data/word_vectors`\n\n```\n# Get word embeddings\nwget -P data/word_vectors http://neuroner.com/data/word_vectors/glove.6B.100d.zip\nunzip data/word_vectors/glove.6B.100d.zip -d data/word_vectors/\n```\n\n3. Load sample datasets. These can be loaded by calling the `neuromodel.fetch_data()` function from a Python interpreter or with the `--fetch_data` argument at the command line.\n\n```\n# Load a dataset from the command line\nneuroner --fetch_data=conll2003\nneuroner --fetch_data=example_unannotated_texts\nneuroner --fetch_data=i2b2_2014_deid\n```\n\n```\n# Load a dataset from a Python interpreter\nfrom neuroner import neuromodel\nneuromodel.fetch_data('conll2003')\nneuromodel.fetch_data('example_unannotated_texts')\nneuromodel.fetch_data('i2b2_2014_deid')\n```\n\n4. Load a pretrained model. The models can be loaded by calling the `neuromodel.fetch_model()` function from a Python interpreter or with the `--fetch_trained_models` argument at the command line.\n\n```\n# Load a pre-trained model from the command line\nneuroner --fetch_trained_model=conll_2003_en\nneuroner --fetch_trained_model=i2b2_2014_glove_spacy_bioes\nneuroner --fetch_trained_model=i2b2_2014_glove_stanford_bioes\nneuroner --fetch_trained_model=mimic_glove_spacy_bioes\nneuroner --fetch_trained_model=mimic_glove_stanford_bioes\n```\n\n```\n# Load a pre-trained model from a Python interpreter\nfrom neuroner import neuromodel\nneuromodel.fetch_model('conll_2003_en')\nneuromodel.fetch_model('i2b2_2014_glove_spacy_bioes')\nneuromodel.fetch_model('i2b2_2014_glove_stanford_bioes')\nneuromodel.fetch_model('mimic_glove_spacy_bioes')\nneuromodel.fetch_model('mimic_glove_stanford_bioes')\n```\n\n### Installing BRAT (optional) \n\nBRAT is a tool that can be used to create, change or view the BRAT-style annotations. For installation and usage instructions, see the [BRAT website](http://brat.nlplab.org/installation.html).\n\n### Installing Perl (platform dependent)\n\nPerl is required because the official CoNLL-2003 evaluation script is written in this language: http://strawberryperl.com. For Unix and Mac OSX systems, Perl should already be installed. For Windows systems, you may need to install it.\n\n## Using NeuroNER\n\nNeuroNER can either be run from the command line or from a Python interpreter.\n\n### Using NeuroNer from a Python interpreter\n\nTo use NeuroNER from the command line, create an instance of the neuromodel with your desired arguments, and then call the relevant methods. Additional parameters can be set from a `parameters.ini` file in the working directory. For example:\n\n```\nfrom neuroner import neuromodel\nnn = neuromodel.NeuroNER(train_model=False, use_pretrained_model=True)\n```\n\nMore detail to follow.\n\n### Using NeuroNer from the command line\n\nBy default NeuroNER is configured to train and test on the CoNLL-2003 dataset. Running neuroner with the default settings starts training on the CoNLL-2003 dataset (the F1-score on the test set should be around 0.90, i.e. on par with state-of-the-art systems). To start the training:\n\n```\n# To use the CPU if you have installed tensorflow, or use the GPU if you have installed tensorflow-gpu:\nneuroner\n\n# To use the CPU only if you have installed tensorflow-gpu:\nCUDA_VISIBLE_DEVICES=\"\" neuroner\n\n# To use the GPU 1 only if you have installed tensorflow-gpu:\nCUDA_VISIBLE_DEVICES=1 neuroner\n```\n\nIf you wish to change any of NeuroNER parameters, you can modify the [`parameters.ini`](parameters.ini) configuration file in your working directory or specify it as an argument.\n\nFor example, to reduce the number of training epochs and not use any pre-trained token embeddings:\n\n```\nneuroner --maximum_number_of_epochs=2 --token_pretrained_embedding_filepath=\"\"\n```\n\nTo perform NER on some plain texts using a pre-trained model:\n\n```\nneuroner --train_model=False --use_pretrained_model=True --dataset_text_folder=./data/example_unannotated_texts --pretrained_model_folder=./trained_models/conll_2003_en\n```\n\nIf a parameter is specified in both the [`parameters.ini`](parameters.ini) configuration file and as an argument, then the argument takes precedence (i.e., the parameter in [`parameters.ini`](parameters.ini) is ignored). You may specify a different configuration file with the `--parameters_filepath` command line argument. The command line arguments have no default value except for `--parameters_filepath`, which points to [`parameters.ini`](parameters.ini).\n\nNeuroNER has 3 modes of operation:\n\n- training mode (from scratch): the dataset folder must have train and valid sets. Test and deployment sets are optional.\n- training mode (from pretrained model): the dataset folder must have train and valid sets. Test and deployment sets are optional.\n- prediction mode (using pretrained model): the dataset folder must have either a test set or a deployment set.\n\n### Adding a new dataset\n\nA dataset may be provided in either CoNLL-2003 or BRAT format. The dataset files and folders should be organized and named as follows:\n\n- Training set: `train.txt` file (CoNLL-2003 format) or `train` folder (BRAT format). It must contain labels.\n- Validation set: `valid.txt` file (CoNLL-2003 format) or `valid` folder (BRAT format). It must contain labels.\n- Test set: `test.txt` file (CoNLL-2003 format) or `test` folder (BRAT format). It must contain labels.\n- Deployment set: `deploy.txt` file (CoNLL-2003 format) or `deploy` folder (BRAT format). It shouldn't contain any label (if it does, labels are ignored).\n\nWe provide several examples of datasets:\n\n- [`data/conll2003/en`](data/conll2003/en): annotated dataset with the CoNLL-2003 format, containing 3 files (`train.txt`, `valid.txt` and `test.txt`).\n- [`data/example_unannotated_texts`](data/example_unannotated_texts): unannotated dataset with the BRAT format, containing 1 folder (`deploy/`). Note that the BRAT format with no annotation is the same as plain texts.\n\n### Using a pretrained model\n\nIn order to use a pretrained model, the `pretrained_model_folder` parameter in the [`parameters.ini`](parameters.ini) configuration file must be set to the folder containing the pretrained model. The following parameters in the [`parameters.ini`](parameters.ini) configuration file must also be set to the same values as in the configuration file located in the specified `pretrained_model_folder`:\n\n```\nuse_character_lstm\ncharacter_embedding_dimension\ncharacter_lstm_hidden_state_dimension\ntoken_pretrained_embedding_filepath\ntoken_embedding_dimension\ntoken_lstm_hidden_state_dimension\nuse_crf\ntagging_format\ntokenizer\n```\n\n### Sharing a pretrained model\n\nYou are highly encouraged to share a model trained on their own datasets, so that other users can use the pretrained model on other datasets. We provide the [`neuroner/prepare_pretrained_model.py`](neuroner/prepare_pretrained_model.py) script to make it easy to prepare a pretrained model for sharing. In order to use the script, one only needs to specify the `output_folder_name`, `epoch_number`, and `model_name` parameters in the script.\n\nBy default, the only information about the dataset contained in the pretrained model is the list of tokens that appears in the dataset used for training and the corresponding embeddings learned from the dataset.\n\nIf you wish to share a pretrained model without providing any information about the dataset (including the list of tokens appearing in the dataset), you can do so by setting\n\n```delete_token_mappings = True```\n\nwhen running the script. In this case, it is highly recommended to use some external pre-trained token embeddings and freeze them while training the model to obtain high performance. This can be done by specifying the `token_pretrained_embedding_filepath` and setting\n\n```freeze_token_embeddings = True```\n\nin the [`parameters.ini`](parameters.ini) configuration file during training.\n\nIn order to share a pretrained model, please [submit a new issue](https://github.com/Franck-Dernoncourt/NeuroNER/issues/new) on the GitHub repository.\n\n### Using TensorBoard\n\nYou may launch TensorBoard during or after the training phase. To do so, run in the terminal from the NeuroNER folder:\n```\ntensorboard --logdir=output\n```\n\nThis starts a web server that is accessible at http://127.0.0.1:6006 from your web browser.\n\n## Citation\n\nIf you use NeuroNER in your publications, please cite this [paper](https://arxiv.org/abs/1705.05487):\n\n```\n@article{2017neuroner,\n title={{NeuroNER}: an easy-to-use program for named-entity recognition based on neural networks},\n author={Dernoncourt, Franck and Lee, Ji Young and Szolovits, Peter},\n journal={Conference on Empirical Methods on Natural Language Processing (EMNLP)},\n year={2017}\n}\n```\n\nThe neural network architecture used in NeuroNER is described in this [article](https://arxiv.org/abs/1606.03475):\n\n```\n@article{2016deidentification,\n title={De-identification of Patient Notes with Recurrent Neural Networks},\n author={Dernoncourt, Franck and Lee, Ji Young and Uzuner, Ozlem and Szolovits, Peter},\n journal={Journal of the American Medical Informatics Association (JAMIA)},\n year={2016}\n}\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Franck-Dernoncourt/NeuroNER", "keywords": "Named-entity recognition using neural networks", "license": "", "maintainer": "", "maintainer_email": "", "name": "pyneuroner", "package_url": "https://pypi.org/project/pyneuroner/", "platform": "", "project_url": "https://pypi.org/project/pyneuroner/", "project_urls": { "Homepage": "https://github.com/Franck-Dernoncourt/NeuroNER" }, "release_url": "https://pypi.org/project/pyneuroner/1.0.8/", "requires_dist": [ "matplotlib (>=3.0.2)", "networkx (>=2.2)", "pycorenlp (>=0.3.0)", "scikit-learn (>=0.20.2)", "scipy (>=1.2.0)", "spacy (>=2.0.18)", "tensorflow (>=1.12.0) ; extra == 'cpu'", "tensorflow-gpu (>=1.0.0) ; extra == 'gpu'" ], "requires_python": "", "summary": "NeuroNER", "version": "1.0.8" }, "last_serial": 5921034, "releases": { "1.0.4": [ { "comment_text": "", "digests": { "md5": "9f250d515f834f2f87fc1d52522dc468", "sha256": "a55e2011e56c64cf88f9dfd65da89e9546763b9ac833c722fdf501ab1838d5ff" }, "downloads": -1, "filename": "pyneuroner-1.0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "9f250d515f834f2f87fc1d52522dc468", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26857888, "upload_time": "2019-03-26T22:27:24", "url": "https://files.pythonhosted.org/packages/3e/ad/6c9da75426d19b29f6733aae95c62b32df2ebafb5805ddd12769fa5b5ddf/pyneuroner-1.0.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "951a80ed711ec156c939654ae9915bd2", "sha256": "ee6d8a7137ab0b03aaa24a11f10fb39a049b2e491dd5fe6057664b11090c9783" }, "downloads": -1, "filename": "pyneuroner-1.0.4.tar.gz", "has_sig": false, "md5_digest": "951a80ed711ec156c939654ae9915bd2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26827705, "upload_time": "2019-03-26T22:27:28", "url": "https://files.pythonhosted.org/packages/b6/d2/c1805d82f61ca56948f219177c3d5cdce464051dd5ff434e62d810af5450/pyneuroner-1.0.4.tar.gz" } ], "1.0.5": [ { "comment_text": "", "digests": { "md5": "5d562ffa9a5c7415accc4d8069a90b9e", "sha256": "a2e883cfc3790210b70a97b30519aae4929cad342d9fb83022d2eac3ae326965" }, "downloads": -1, "filename": "pyneuroner-1.0.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "5d562ffa9a5c7415accc4d8069a90b9e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26858063, "upload_time": "2019-03-29T19:08:07", "url": "https://files.pythonhosted.org/packages/4f/00/f409beedb4e4e130c3d38dbccfe6cbd4303b4ee9707c44044735c31e49e9/pyneuroner-1.0.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "94d54378f4bc948d46cb54dc40b001f2", "sha256": "9816d3b86f24803211588a5677a73979e772b0bf36ba12ef5f0f692a4e30c222" }, "downloads": -1, "filename": "pyneuroner-1.0.5.tar.gz", "has_sig": false, "md5_digest": "94d54378f4bc948d46cb54dc40b001f2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26827929, "upload_time": "2019-03-29T19:08:19", "url": "https://files.pythonhosted.org/packages/4b/f4/7c88913f2b197dee3c56b111282d44df78a790602a5ba12607fb56db31e1/pyneuroner-1.0.5.tar.gz" } ], "1.0.6": [ { "comment_text": "", "digests": { "md5": "61779c38ea4e5eb0039a0d26d020905e", "sha256": "3bb6e7290d3431531fedda00503cded1154ee69077118cafb08dfea56a5c4ff3" }, "downloads": -1, "filename": "pyneuroner-1.0.6-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "61779c38ea4e5eb0039a0d26d020905e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26858008, "upload_time": "2019-03-29T21:09:38", "url": "https://files.pythonhosted.org/packages/7b/39/d5900626eb7bb5771f82f477692ebf822a38d91b1291f199b0123d0cf5c2/pyneuroner-1.0.6-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d43883aca51f42b7bb60c0df83a8c695", "sha256": "8a4a5b2afa5ee2a23b8252dcf7205e94b7052e681cbd6072576f44e3804a43de" }, "downloads": -1, "filename": "pyneuroner-1.0.6.tar.gz", "has_sig": false, "md5_digest": "d43883aca51f42b7bb60c0df83a8c695", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26827869, "upload_time": "2019-03-29T21:09:53", "url": "https://files.pythonhosted.org/packages/96/7a/8adf58f3eaf258aad0bf2a3bc6357a2dac5be98ff23550973f68646245ae/pyneuroner-1.0.6.tar.gz" } ], "1.0.7": [ { "comment_text": "", "digests": { "md5": "013218d0d7c3cb66ded05a5246b610a5", "sha256": "4f959babf8d58ecaf7791880f51d0de1d923f8f58286ee6969ea6e10a9c9d7aa" }, "downloads": -1, "filename": "pyneuroner-1.0.7-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "013218d0d7c3cb66ded05a5246b610a5", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26858083, "upload_time": "2019-04-30T19:16:15", "url": "https://files.pythonhosted.org/packages/7a/0f/5bbbb0367df8e2a66204e1c1aabb9201653359e28f5412e3af61a927a3e6/pyneuroner-1.0.7-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "44df1a5527293fc9af8df69869e5ee2d", "sha256": "7a8c5b935fac20681dde8728c4c38b55f15a7d4a43da7c6227c4fd92bc2441e8" }, "downloads": -1, "filename": "pyneuroner-1.0.7.tar.gz", "has_sig": false, "md5_digest": "44df1a5527293fc9af8df69869e5ee2d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26828022, "upload_time": "2019-04-30T19:16:21", "url": "https://files.pythonhosted.org/packages/1c/16/2cf9f004cceb84eb5fd655aedae3dc0d8e8484f6aad612018cd27f18d57e/pyneuroner-1.0.7.tar.gz" } ], "1.0.8": [ { "comment_text": "", "digests": { "md5": "e4f781b44cf9b573dd49801da10fcd10", "sha256": "5a4526e81dc309d37e8b81e26986faf63987f8c7de9716b62bc51ce5f0190e6d" }, "downloads": -1, "filename": "pyneuroner-1.0.8-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e4f781b44cf9b573dd49801da10fcd10", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26858929, "upload_time": "2019-10-02T23:30:15", "url": "https://files.pythonhosted.org/packages/19/cb/7fe87cdfe3f969078edebf3f41520efbc0d1d883e2b05a22b6a068593db1/pyneuroner-1.0.8-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3642202dea9874b255b957df5701b356", "sha256": "e223438793bb756d3a67665a6defc42ad10be4da75b3537813a97d7f26af2769" }, "downloads": -1, "filename": "pyneuroner-1.0.8.tar.gz", "has_sig": false, "md5_digest": "3642202dea9874b255b957df5701b356", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26828106, "upload_time": "2019-10-02T23:30:21", "url": "https://files.pythonhosted.org/packages/86/fc/8e1e42f0147c0d501d7dad9a202fe5a7b3c1639f79dfc6c982e1f7d32851/pyneuroner-1.0.8.tar.gz" } ], "1.0.dev3": [ { "comment_text": "", "digests": { "md5": "89633c189148b86e1cc02d253e45195e", "sha256": "605055d47de2ccb089614168a97ff0709aa05c10129dadc50d7a1a10f2b20a3b" }, "downloads": -1, "filename": "pyneuroner-1.0.dev3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "89633c189148b86e1cc02d253e45195e", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26858469, "upload_time": "2019-03-13T22:12:54", "url": "https://files.pythonhosted.org/packages/fa/ef/6ba117ef7baebe5fcb2dd0f41c3c026e727cc360a4eeb86a180c06dc737c/pyneuroner-1.0.dev3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9ff9ec2880a9bc9fdf87036c57f24a3e", "sha256": "a3ce5ababee863196502031a62f9d7accb2727b01397a07365319d07075339d9" }, "downloads": -1, "filename": "pyneuroner-1.0.dev3.tar.gz", "has_sig": false, "md5_digest": "9ff9ec2880a9bc9fdf87036c57f24a3e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26827104, "upload_time": "2019-03-13T22:12:59", "url": "https://files.pythonhosted.org/packages/6c/d2/fc215a53ebf2ddb03648e45cfb69072204f5ae6052b854294ad0ec1c9b88/pyneuroner-1.0.dev3.tar.gz" } ], "1.0.dev4": [ { "comment_text": "", "digests": { "md5": "a493a023d2dc05b9611283645e97076c", "sha256": "f5831fb1b12b663dfa576c5e01019c1674f6ef514db4a90d9ee1391aff5d68d6" }, "downloads": -1, "filename": "pyneuroner-1.0.dev4.tar.gz", "has_sig": false, "md5_digest": "a493a023d2dc05b9611283645e97076c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26826602, "upload_time": "2019-03-26T22:27:33", "url": "https://files.pythonhosted.org/packages/8d/e2/faadc7f675e5da62b8819fd131a433fbadad3beae8768d5fa078ec2ce315/pyneuroner-1.0.dev4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e4f781b44cf9b573dd49801da10fcd10", "sha256": "5a4526e81dc309d37e8b81e26986faf63987f8c7de9716b62bc51ce5f0190e6d" }, "downloads": -1, "filename": "pyneuroner-1.0.8-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e4f781b44cf9b573dd49801da10fcd10", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 26858929, "upload_time": "2019-10-02T23:30:15", "url": "https://files.pythonhosted.org/packages/19/cb/7fe87cdfe3f969078edebf3f41520efbc0d1d883e2b05a22b6a068593db1/pyneuroner-1.0.8-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3642202dea9874b255b957df5701b356", "sha256": "e223438793bb756d3a67665a6defc42ad10be4da75b3537813a97d7f26af2769" }, "downloads": -1, "filename": "pyneuroner-1.0.8.tar.gz", "has_sig": false, "md5_digest": "3642202dea9874b255b957df5701b356", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26828106, "upload_time": "2019-10-02T23:30:21", "url": "https://files.pythonhosted.org/packages/86/fc/8e1e42f0147c0d501d7dad9a202fe5a7b3c1639f79dfc6c982e1f7d32851/pyneuroner-1.0.8.tar.gz" } ] }