{ "info": { "author": "Benjamin Weigang", "author_email": "Benjamin.Weigang@mailbox.org", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# rasa_composite_entities\n\nA Rasa NLU component for composite entities, developed to be used in the\nDialogue Engine of [Dialogue Technologies](https://www.dialogue-technologies.com).\n\nSee also [my blog post](https://www.benjaminweigang.com/rasa-nlu-composite-entities/).\n\n**Works with rasa 1.x!**\n\n## Installation\n\n```bash\n$ pip install rasa_composite_entities\n```\n\nThe only external dependency is Rasa NLU itself, which should be installed\nanyway when you want to use this component.\n\nAfter installation, the component can be added to your pipeline like any other\ncomponent:\n\n```yaml\nlanguage: \"en_core_web_md\"\n\npipeline:\n- name: \"SpacyNLP\"\n- name: \"SpacyTokenizer\"\n- name: \"SpacyFeaturizer\"\n- name: \"CRFEntityExtractor\"\n- name: \"SklearnIntentClassifier\"\n- name: \"rasa_composite_entities.CompositeEntityExtractor\"\n```\n\n## Usage\n\nSimply add another entry to your training file (in JSON format) defining\ncomposite patterns:\n```json\n\"composite_entities\": [\n {\n \"name\": \"product_with_attributes\",\n \"patterns\": [\n \"@color @product with @pattern\",\n \"@pattern @color @product\"\n ]\n }\n],\n\"common_examples\": [\n ...\n]\n```\nEvery word starting with a \"@\" will be considered a placeholder for an entity\nwith that name. The component is agnostic to the origin of entities; you can\nuse anything that Rasa NLU returns as the \"entity\" field in its messages. 
This\nmeans that you can not only use the entities defined in your common examples,\nbut also numerical entities from duckling etc.\n\nLonger patterns always take precedence over shorter patterns. If a shorter\npattern matches entities that would also be matched by a longer pattern, the\nshorter pattern is ignored.\n\nPatterns are regular expressions! You can use patterns like\n```\n\"composite_entities\": [\n {\n \"name\": \"product_with_attributes\",\n \"patterns\": [\n \"(?:@pattern\\\\s+)?(?:@color\\\\s+)?@product(?:\\\\s+with @[A-Z,a-z]+)?\"\n ]\n }\n]\n```\nto match different variations of entity combinations. Be aware that you may\nneed to properly escape your regexes to produce valid JSON files (in case of\nthis example, you have to escape the backslashes with another backslash).\n\n## Explanation\n\nComposite entities act as containers that group several entities into logical\nunits. Consider the following example phrase:\n```\nI am looking for a red shirt with stripes and checkered blue shoes.\n```\nProperly trained, Rasa NLU could return entities like this:\n```json\n\"entities\": [\n {\n \"start\": 19,\n \"end\": 22,\n \"value\": \"red\",\n \"entity\": \"color\",\n \"confidence\": 0.9419322376955782,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 23,\n \"end\": 28,\n \"value\": \"shirt\",\n \"entity\": \"product\",\n \"confidence\": 0.9435936216683031,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 34,\n \"end\": 41,\n \"value\": \"stripes\",\n \"entity\": \"pattern\",\n \"confidence\": 0.9233923349716401,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 46,\n \"end\": 55,\n \"value\": \"checkered\",\n \"entity\": \"pattern\",\n \"confidence\": 0.8877627536275875,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 56,\n \"end\": 60,\n \"value\": \"blue\",\n \"entity\": \"color\",\n \"confidence\": 0.6778344517453893,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 61,\n \"end\": 66,\n 
\"value\": \"shoes\",\n \"entity\": \"product\",\n \"confidence\": 0.536797743231954,\n \"extractor\": \"CRFEntityExtractor\"\n }\n]\n```\n\nIt's hard to infer exactly what the user is looking for from this output alone.\nIs he looking for a striped and checkered shirt? Striped and checkered shoes?\nOr a striped shirt and checkered shoes?\n\nBy defining common patterns of entity combinations, we can automatically create\nentity groups. If we add the composite entity patterns as in the usage example\nabove, the output will be changed to this:\n```json\n\"entities\": [\n {\n \"confidence\": 1.0,\n \"entity\": \"product_with_attributes\",\n \"extractor\": \"composite\",\n \"contained_entities\": [\n {\n \"start\": 19,\n \"end\": 22,\n \"value\": \"red\",\n \"entity\": \"color\",\n \"confidence\": 0.9419322376955782,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 23,\n \"end\": 28,\n \"value\": \"shirt\",\n \"entity\": \"product\",\n \"confidence\": 0.9435936216683031,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 34,\n \"end\": 41,\n \"value\": \"stripes\",\n \"entity\": \"pattern\",\n \"confidence\": 0.9233923349716401,\n \"extractor\": \"CRFEntityExtractor\"\n }\n ]\n },\n {\n \"confidence\": 1.0,\n \"entity\": \"product_with_attributes\",\n \"extractor\": \"composite\",\n \"contained_entities\": [\n {\n \"start\": 46,\n \"end\": 55,\n \"value\": \"checkered\",\n \"entity\": \"pattern\",\n \"confidence\": 0.8877627536275875,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 56,\n \"end\": 60,\n \"value\": \"blue\",\n \"entity\": \"color\",\n \"confidence\": 0.6778344517453893,\n \"extractor\": \"CRFEntityExtractor\"\n },\n {\n \"start\": 61,\n \"end\": 66,\n \"value\": \"shoes\",\n \"entity\": \"product\",\n \"confidence\": 0.536797743231954,\n \"extractor\": \"CRFEntityExtractor\"\n }\n ]\n }\n]\n```\n\n## Example\n\nSee the `example` folder for a minimal example that can be trained and tested.\nTo get the output from 
above, run:\n```bash\n$ rasa train nlu --out . --nlu train.json --config config_with_composite.yml\n$ rasa run --enable-api --model .\n$ curl -XPOST localhost:5005/model/parse -d '{\"text\": \"I am looking for a red shirt with stripes and checkered blue shoes\"}'\n```\nIf you want to compare this output to the normal Rasa NLU output, use the\nalternative `config_without_composite.yml` config file.\n\nThe component also works when training using the server API:\n\n**HTTP training is currently broken because of API changes in rasa 1.x.\nHopefully, this will soon be fixed!**\n```bash\n$ rasa run --enable-api --model .\n$ curl --request POST --header 'content-type: application/x-yml' --data-binary @train_http.yml --url 'localhost:5000/train?project=test_project'\n$ curl -XPOST localhost:5005/model/parse -d '{\"text\": \"I am looking for a red shirt with stripes and checkered blue shoes\", \"project\": \"test_project\"}'\n```\n\n## Caveats\n\nRasa NLU strips training files of any custom fields, including our\n\"composite_entities\" field. For our component to access this information, we\nhave to circumvent Rasa's train file loading process and get direct access to\nthe raw data.\n\nWhen training through Rasa's train script, the train file paths are fetched\nthrough the command line arguments. When training NLU only, the paths defined\nby the `--nlu` argument are used, otherwise the paths will be taken from the\n`--data` argument.\n\nWhen training through the HTTP server, we exploit the fact that Rasa NLU\ncreates temporary files containing the raw train data. Be aware that this\ncreates a possible race condition when multiple training processes are executed\nsimultaneously. 
If a new train process is started before the previous process\nhas reached the CompositeEntityExtractor, there is a chance that the wrong\ntrain data will be picked up.\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE.md file for\ndetails.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/BeWe11/rasa_composite_entities", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "rasa-composite-entities", "package_url": "https://pypi.org/project/rasa-composite-entities/", "platform": "", "project_url": "https://pypi.org/project/rasa-composite-entities/", "project_urls": { "Homepage": "https://github.com/BeWe11/rasa_composite_entities" }, "release_url": "https://pypi.org/project/rasa-composite-entities/0.4.4/", "requires_dist": null, "requires_python": "", "summary": "A Rasa NLU component for composite entities", "version": "0.4.4" }, "last_serial": 5476214, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "5da360143150c2f14da92f7c962917ce", "sha256": "ab1221fd692c6346d2910e51c5c79881855bab58ff452001737793bbf2239f59" }, "downloads": -1, "filename": "rasa_composite_entities-0.1.0.tar.gz", "has_sig": false, "md5_digest": "5da360143150c2f14da92f7c962917ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5341, "upload_time": "2019-01-17T16:57:11", "url": "https://files.pythonhosted.org/packages/f7/18/4148ad1759345bfa5deb189084c336baa48ae7b807da6b896fabf74d8db0/rasa_composite_entities-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "d9d7eb9245653a0258e12c3ef5ab6e74", "sha256": "5a3cf45f1c207c929b5670ceeb3602f8ff78befb861fff930015d2331da3991c" }, "downloads": -1, "filename": "rasa_composite_entities-0.2.0.tar.gz", "has_sig": false, "md5_digest": "d9d7eb9245653a0258e12c3ef5ab6e74", "packagetype": "sdist", 
"python_version": "source", "requires_python": null, "size": 5864, "upload_time": "2019-01-18T16:20:05", "url": "https://files.pythonhosted.org/packages/a1/00/2f18e588f40fa1ccc5f1890174ec718629725a50b5c353e328a2ff879bb8/rasa_composite_entities-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "dcb9133c8b211d2aef7dd85de6747a0a", "sha256": "16a1ac464ad6bcc0ce76f1c844126ab8ce8a96497f96b16be61443163275da8f" }, "downloads": -1, "filename": "rasa_composite_entities-0.2.1.tar.gz", "has_sig": false, "md5_digest": "dcb9133c8b211d2aef7dd85de6747a0a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5846, "upload_time": "2019-01-19T00:56:09", "url": "https://files.pythonhosted.org/packages/1d/58/c2f8d2e6d755f743beb3af72d301dabffbe1292ee8a7cc78189068939d8c/rasa_composite_entities-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "05f1f180b47b8fa5418d5a791615f518", "sha256": "fbb7d4509a92ced1b5e2390b5b46c043e57943ab01f08e2fe41bdc92b07f8944" }, "downloads": -1, "filename": "rasa_composite_entities-0.2.2.tar.gz", "has_sig": false, "md5_digest": "05f1f180b47b8fa5418d5a791615f518", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6487, "upload_time": "2019-01-25T09:49:36", "url": "https://files.pythonhosted.org/packages/9d/52/e5e6359c9ce6d636afe98323949d041ac263f5b48aabc599ef6f174edf7d/rasa_composite_entities-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "0dc0bf57a2c2d7efc5be161e96479211", "sha256": "195e808a3bb7a93bf61728bfecb18d01910d3fe1d3c899de69778758f75c82fd" }, "downloads": -1, "filename": "rasa_composite_entities-0.2.3.tar.gz", "has_sig": false, "md5_digest": "0dc0bf57a2c2d7efc5be161e96479211", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6300, "upload_time": "2019-01-25T11:09:49", "url": 
"https://files.pythonhosted.org/packages/32/ef/cc4b5933f18447cedfd5833fd55d65996a23c0c92c4eab943286dad811ae/rasa_composite_entities-0.2.3.tar.gz" } ], "0.2.4": [ { "comment_text": "", "digests": { "md5": "5b71c5da21f1d61e36132c09a09b90a0", "sha256": "d35f8b4a811964b9d8ad1aa5423b02361d3808ad35cdcfe7a6df8eb410e65213" }, "downloads": -1, "filename": "rasa_composite_entities-0.2.4.tar.gz", "has_sig": false, "md5_digest": "5b71c5da21f1d61e36132c09a09b90a0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6338, "upload_time": "2019-03-15T11:53:25", "url": "https://files.pythonhosted.org/packages/82/7b/1d7e9552b01bf2d5e526548ba95660161672e3b8acc12b2de4e063acc222/rasa_composite_entities-0.2.4.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "30e53af0a2a33b883871f0de664787a1", "sha256": "a4f861dbf1cfc429b8108498646e54541c06c3e0333c7dbd6a828caaa56c491f" }, "downloads": -1, "filename": "rasa_composite_entities-0.3.0.tar.gz", "has_sig": false, "md5_digest": "30e53af0a2a33b883871f0de664787a1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6263, "upload_time": "2019-05-02T14:52:01", "url": "https://files.pythonhosted.org/packages/9d/64/033e0b058a957bb9ff067a4d32b0b9a29424eb15fbbcc0dcd7b1f3c89782/rasa_composite_entities-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "30a8c8f51b71f653736c6fcc4dd7a802", "sha256": "3782ea649644bae453e7f8500b7328751c76fd5e47ca6f43d4c1ffbf02c81d94" }, "downloads": -1, "filename": "rasa_composite_entities-0.4.0.tar.gz", "has_sig": false, "md5_digest": "30a8c8f51b71f653736c6fcc4dd7a802", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6470, "upload_time": "2019-06-04T11:51:50", "url": "https://files.pythonhosted.org/packages/59/86/64de94c45bcf974e77a0900d508acf09a324e70bc933b576af6207e6ae98/rasa_composite_entities-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": 
"10339ab7b20cdfeac3452c8638b2de5c", "sha256": "6d78171e75d722293036e92f3ea85db1d51d609aebaae98a31eb578f6520c477" }, "downloads": -1, "filename": "rasa_composite_entities-0.4.1.tar.gz", "has_sig": false, "md5_digest": "10339ab7b20cdfeac3452c8638b2de5c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6476, "upload_time": "2019-06-05T15:08:29", "url": "https://files.pythonhosted.org/packages/8b/87/f4d6606894cc551f701f042da5527bb50e6a915b59e7d2abaf38749a0e28/rasa_composite_entities-0.4.1.tar.gz" } ], "0.4.2": [ { "comment_text": "", "digests": { "md5": "28578bf111b75b7396b4613612226bd2", "sha256": "9aa5456a22d40c393c4fab2247613319e03dd5b3ee868fff85be95d90ad081d8" }, "downloads": -1, "filename": "rasa_composite_entities-0.4.2.tar.gz", "has_sig": false, "md5_digest": "28578bf111b75b7396b4613612226bd2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7668, "upload_time": "2019-06-07T15:25:12", "url": "https://files.pythonhosted.org/packages/47/ec/f4d8da7131d719348b4005e8b7371d86ce79ff6894ce7d360ff464d59f56/rasa_composite_entities-0.4.2.tar.gz" } ], "0.4.3": [ { "comment_text": "", "digests": { "md5": "9447199dd6b99911b84fa3ec7051ccf1", "sha256": "62afef4be4a453a94bf0f4fccd33e61bfbfa0757ce06228faf2218246782cbcd" }, "downloads": -1, "filename": "rasa_composite_entities-0.4.3.tar.gz", "has_sig": false, "md5_digest": "9447199dd6b99911b84fa3ec7051ccf1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7660, "upload_time": "2019-07-02T10:07:29", "url": "https://files.pythonhosted.org/packages/4c/57/26cfa53922e4f9138f540e3b81dae2a39e40dea1881f48ed7e56c1d4d990/rasa_composite_entities-0.4.3.tar.gz" } ], "0.4.4": [ { "comment_text": "", "digests": { "md5": "8126dc57bc1c8b013371b64c9e6b975e", "sha256": "2565e8f7006928ea26bb2f7e271ec50b57de10aff258195fe2c8a4281365af51" }, "downloads": -1, "filename": "rasa_composite_entities-0.4.4.tar.gz", "has_sig": false, "md5_digest": 
"8126dc57bc1c8b013371b64c9e6b975e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7487, "upload_time": "2019-07-02T10:16:42", "url": "https://files.pythonhosted.org/packages/06/d1/4793ec70bdc249b22e60151f6b93ccc4220eb6f67406c876355680fb69ea/rasa_composite_entities-0.4.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8126dc57bc1c8b013371b64c9e6b975e", "sha256": "2565e8f7006928ea26bb2f7e271ec50b57de10aff258195fe2c8a4281365af51" }, "downloads": -1, "filename": "rasa_composite_entities-0.4.4.tar.gz", "has_sig": false, "md5_digest": "8126dc57bc1c8b013371b64c9e6b975e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7487, "upload_time": "2019-07-02T10:16:42", "url": "https://files.pythonhosted.org/packages/06/d1/4793ec70bdc249b22e60151f6b93ccc4220eb6f67406c876355680fb69ea/rasa_composite_entities-0.4.4.tar.gz" } ] }