{ "info": { "author": "Marc Puig", "author_email": "marc.puig@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "spacy-lookup: Named Entity Recognition based on dictionaries\n************************************************************\n\n`spaCy v2.0 `_ extension and pipeline component\nfor adding Named Entities metadata to ``Doc`` objects. Detects Named Entities\nusing dictionaries. The extension sets the custom ``Doc``,\n``Token`` and ``Span`` attributes ``._.is_entity``, ``._.entity_type``,\n``._.has_entities`` and ``._.entities``.\n\nNamed Entities are matched using the python module ``flashtext``, and\nlooks up in the data provided by different dictionaries.\n\nInstallation\n===============\n\n``spacy-lookup`` requires ``spacy`` v2.0.16 or higher.\n\n.. code:: bash\n\n pip install spacy-lookup\n\nUsage\n=====\nFirst, you need to download a language model.\n\n.. code:: bash\n\n python -m spacy download en\n\nImport the component and initialise it with the shared ``nlp`` object (i.e. an\ninstance of ``Language``), which is used to initialise ``flashtext``\nwith the shared vocab, and create the match patterns. Then add the component\nanywhere in your pipeline.\n\n.. code:: python\n\n import spacy\n from spacy_lookup import Entity\n\n nlp = spacy.load('en')\n entity = Entity(keywords_list=['python', 'product manager', 'java platform'])\n nlp.add_pipe(entity, last=True)\n\n doc = nlp(u\"I am a product manager for a java and python.\")\n assert doc._.has_entities == True\n assert doc[0]._.is_entity == False\n assert doc[3]._.entity_desc == 'product manager'\n assert doc[3]._.is_entity == True\n\n print([(token.text, token._.canonical) for token in doc if token._.is_entity])\n\n\n``spacy-lookup`` only cares about the token text, so you can use it on a blank\n``Language`` instance (it should work for all\n`available languages `_!), or in\na pipeline with a loaded model. If you're loading a model and your pipeline\nincludes a tagger, parser and entity recognizer, make sure to add the entity\ncomponent as ``last=True``, so the spans are merged at the end of the pipeline.\n\nAvailable attributes\n--------------------\n\nThe extension sets attributes on the ``Doc``, ``Span`` and ``Token``. You can\nchange the attribute names on initialisation of the extension. For more details\non custom components and attributes, see the\n`processing pipelines documentation `_.\n\n====================== ======= ===\n``Token._.is_entity`` bool Whether the token is an entity.\n``Token._.entity_type`` unicode A human-readable description of the entity.\n``Doc._.has_entities`` bool Whether the document contains entity.\n``Doc._.entities`` list ``(entity, index, description)`` tuples of the document's entities.\n``Span._.has_entities`` bool Whether the span contains entity.\n``Span._.entities`` list ``(entity, index, description)`` tuples of the span's entities.\n====================== ======= ===\n\nSettings\n--------\n\nOn initialisation of ``Entity``, you can define the following settings:\n\n=============== ============ ===\n``nlp`` ``Language`` The shared ``nlp`` object. Used to initialise the matcher with the shared ``Vocab``, and create ``Doc`` match patterns.\n``attrs`` tuple Attributes to set on the ._ property. Defaults to ``('has_entities', 'is_entity', 'entity_type', 'entity')``.\n``keywords_list`` list Optional lookup table with the list of terms to look for.\n``keywords_dict`` dict Optional lookup table with the list of terms to look for.\n``keywords_file`` string Optional filename with the list of terms to look for.\n=============== ============ ===\n\n.. code:: python\n\n entity = Entity(nlp, keywords_list=['python', 'java platform'], label='ACME')\n nlp.add_pipe(entity)\n doc = nlp(u\"I am a product manager for a java platform and python.\")\n assert doc[3]._.is_entity\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/mpuig/spacy-lookup", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "spacy-lookup", "package_url": "https://pypi.org/project/spacy-lookup/", "platform": "", "project_url": "https://pypi.org/project/spacy-lookup/", "project_urls": { "Homepage": "https://github.com/mpuig/spacy-lookup" }, "release_url": "https://pypi.org/project/spacy-lookup/0.1.0/", "requires_dist": [ "spacy (<3.0.0,>=2.0.16)", "flashtext (>=2.7)" ], "requires_python": "", "summary": "spaCy pipeline component for Named Entity Recognition based on dictionaries.", "version": "0.1.0" }, "last_serial": 4891478, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "a8e933a229644ed6f23aea18361f6743", "sha256": "c84bb0d3e1c2ae3a7b19ccf70133747e2891cb403c0a74770868c478fef33de3" }, "downloads": -1, "filename": "spacy_lookup-0.0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a8e933a229644ed6f23aea18361f6743", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7121, "upload_time": "2018-01-15T17:54:07", "url": "https://files.pythonhosted.org/packages/9d/02/65334f32f1dbab2a739de0924f768b695daa97faa56d9b899d0863958d6e/spacy_lookup-0.0.1-py2.py3-none-any.whl" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "f0eb37350e4529af00efc4df6311bd87", "sha256": "420de6cffd213a2ce20809957683d6653916bb6e48e64749d29cc9d823444016" }, "downloads": -1, "filename": "spacy_lookup-0.0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "f0eb37350e4529af00efc4df6311bd87", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7444, "upload_time": "2018-05-21T10:53:10", "url": "https://files.pythonhosted.org/packages/8f/69/9110a5149dd5bbb9d90c39e9b36cb673611e1cec17c78e985a5269eca888/spacy_lookup-0.0.2-py2.py3-none-any.whl" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "e6454408745f5421bee68a5ebb0c5825", "sha256": "42d580b868dcebfbf170007898140772540c86f535bcd3acd5196b7a954b5346" }, "downloads": -1, "filename": "spacy_lookup-0.0.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e6454408745f5421bee68a5ebb0c5825", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7402, "upload_time": "2018-08-13T11:24:23", "url": "https://files.pythonhosted.org/packages/23/34/664e3d9ea2fac0b9702bdf7ee6c0ca37b07f1838dd77e61d5172ea729a73/spacy_lookup-0.0.3-py2.py3-none-any.whl" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "f7750e6aa097e0a534b9a5c3bdc28581", "sha256": "3318ccd16823cbbb186f6bd85b731e33f714578fdec8d3b1a04b2c59ea25ff65" }, "downloads": -1, "filename": "spacy_lookup-0.0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "f7750e6aa097e0a534b9a5c3bdc28581", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6094, "upload_time": "2018-11-17T16:34:24", "url": "https://files.pythonhosted.org/packages/ad/1f/ce98f8dbe0d3d50cb197f2046d236a9c670335d5ec1011a482e2d54da42a/spacy_lookup-0.0.4-py2.py3-none-any.whl" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "3913e87e41ae54404921677486015625", "sha256": "6f80dd9d59615b15436cdb0f64f25e48517aa27007d8f9289ed3f4cc971f1e43" }, "downloads": -1, "filename": "spacy_lookup-0.0.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "3913e87e41ae54404921677486015625", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6206, "upload_time": "2019-03-03T18:18:31", "url": "https://files.pythonhosted.org/packages/01/0c/223057d0e83e8bbc1a6fb0cc78c32cb6d7f7f1f3b50c4db16d04e542c92c/spacy_lookup-0.0.5-py2.py3-none-any.whl" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "3b672137292827b56cea6e387808640a", "sha256": "b9efd5131d470adbf23ada86fc0e48b9c8874c8e0e171d63c933dcb207fadcc6" }, "downloads": -1, "filename": "spacy_lookup-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "3b672137292827b56cea6e387808640a", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6205, "upload_time": "2019-03-03T18:18:32", "url": "https://files.pythonhosted.org/packages/d2/a9/7bddf1c5c0a717508cbc3e47fbf0e5c807ca46232bea46765a2619adafe2/spacy_lookup-0.1.0-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3b672137292827b56cea6e387808640a", "sha256": "b9efd5131d470adbf23ada86fc0e48b9c8874c8e0e171d63c933dcb207fadcc6" }, "downloads": -1, "filename": "spacy_lookup-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "3b672137292827b56cea6e387808640a", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6205, "upload_time": "2019-03-03T18:18:32", "url": "https://files.pythonhosted.org/packages/d2/a9/7bddf1c5c0a717508cbc3e47fbf0e5c807ca46232bea46765a2619adafe2/spacy_lookup-0.1.0-py2.py3-none-any.whl" } ] }