{ "info": { "author": "\u0160ar\u016bnas Navickas", "author_email": "zaibacu@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "# RITA DSL\n\nThis is a language, loosely based on [Apache UIMA RUTA](https://uima.apache.org/ruta.html), for writing manual language rules. It compiles into [spaCy](https://github.com/explosion/spaCy)-compatible patterns. These patterns can be used for [manual NER](https://spacy.io/api/entityruler) as well as in other processes, such as retokenizing and pure matching.\n\n# Documentation\n\n- [Syntax Guide](docs/syntax.md)\n\n- [Macros](docs/macros.md)\n\n- [Extending](docs/extend.md) - injecting custom macros to be used inside rule generation\n\n# Quick Start\nInstall it via `pip install rita-dsl`\n\nYou can start defining rules by creating a file with the extension `*.rita`\n\nBelow is a complete example that can be used as a reference point\n\n```\ncars = LOAD(\"examples/cars.txt\") # Load items from file\ncolors = {\"red\", \"green\", \"blue\", \"white\", \"black\"} # Declare items inline\n\n{IN_LIST(colors), WORD(\"car\")} -> MARK(\"CAR_COLOR\") # If the first token is in list `colors` and the second one is the word `car`, label it\n\n{IN_LIST(cars), WORD+} -> MARK(\"CAR_MODEL\") # If the first token is in list `cars` and is followed by 1..N words, label it\n\n{ENTITY(\"PERSON\"), LEMMA(\"like\"), WORD} -> MARK(\"LIKED_ACTION\") # If the first token is a Person, followed by a word with lemma `like` and then any word, label it\n```\n\nNow you can compile these rules: `rita -f .rita output.jsonl`\n\nAnd load them into spaCy:\n\n```python\nimport spacy\nfrom spacy.pipeline import EntityRuler\n\nnlp = spacy.load(\"en\")\nruler = EntityRuler(nlp, overwrite_ents=True)\nruler.from_disk(\"output.jsonl\")\nnlp.add_pipe(ruler)\n```\n\nEvery time you parse text with spaCy, it will run the usual workflow and apply these rules\n\n```python\ntext = \"\"\"\nJohny Silver was driving a red car. It was BMW X6 Mclass. Johny likes driving it very much.\n\"\"\"\n\ndoc = nlp(text)\n\nentities = [(e.text, e.label_) for e in doc.ents]\nprint(entities)\n\nassert entities[0] == (\"Johny Silver\", \"PERSON\") # Normal NER\nassert entities[1] == (\"red car\", \"CAR_COLOR\") # Our first rule\nassert entities[2] == (\"BMW X6 Mclass\", \"CAR_MODEL\") # Our second rule\nassert entities[3] == (\"Johny likes driving\", \"LIKED_ACTION\") # Our third rule\n```\n\nAlternatively, if `rita` is used as a dependency in a project and you prefer to compile rules dynamically, you can do:\n\n```python\nimport rita\nimport spacy\nfrom spacy.pipeline import EntityRuler\n\nnlp = spacy.load(\"en\")\nruler = EntityRuler(nlp, overwrite_ents=True)\n\npatterns = rita.compile(\"examples/color-car.rita\")\n\nruler.add_patterns(patterns)\nnlp.add_pipe(ruler)\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/zaibacu/rita-dsl", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "rita-dsl", "package_url": "https://pypi.org/project/rita-dsl/", "platform": "", "project_url": "https://pypi.org/project/rita-dsl/", "project_urls": { "Homepage": "https://github.com/zaibacu/rita-dsl" }, "release_url": "https://pypi.org/project/rita-dsl/0.1.1/", "requires_dist": [ "ply" ], "requires_python": "", "summary": "DSL for building language rules", "version": "0.1.1" }, "last_serial": 5541912, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "eec1faf1beda1f4c39b099f07f3e424a", "sha256": 
"6b2baf4f9d5f71d887f34c27a9e50c3c9cf1d168044b3e53c9aba2011b18e724" }, "downloads": -1, "filename": "rita_dsl-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "eec1faf1beda1f4c39b099f07f3e424a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9974, "upload_time": "2019-07-16T17:34:44", "url": "https://files.pythonhosted.org/packages/4c/ef/9963a92286b761492750a7d1b700a00e24189c4843fef09055b61f349c83/rita_dsl-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "eeff4e67cc5cdc5c7a906de8e35623ed", "sha256": "8d1436992b5446a6f54771164f3e38c6fe0a1479ed5bcef0fc44140122dcd618" }, "downloads": -1, "filename": "rita-dsl-0.1.0.tar.gz", "has_sig": false, "md5_digest": "eeff4e67cc5cdc5c7a906de8e35623ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8463, "upload_time": "2019-07-16T17:34:46", "url": "https://files.pythonhosted.org/packages/88/29/157c90f754625c31f164329b80e3ced1134ec7696cd427e81823082bd8f0/rita-dsl-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "5e29cc274bc35d589b7ff78c627bb060", "sha256": "0e1691055c79ba673c0c03263cb7a8c14ce7314459279928021c329a961791d3" }, "downloads": -1, "filename": "rita_dsl-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5e29cc274bc35d589b7ff78c627bb060", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10602, "upload_time": "2019-07-16T17:43:48", "url": "https://files.pythonhosted.org/packages/9d/eb/b804e450065b7dbaa3913be79e1fe103cd68a1c2644b649411040ea16d67/rita_dsl-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5f0ca3e3f3ad81ea0490f61a7c483422", "sha256": "4524ab0f567e876c4badb8be999171c4ce762840e2853f870e29adf598f60fb1" }, "downloads": -1, "filename": "rita-dsl-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5f0ca3e3f3ad81ea0490f61a7c483422", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9417, 
"upload_time": "2019-07-16T17:43:50", "url": "https://files.pythonhosted.org/packages/cb/85/489b34004d9ab94b82c59bee621442518f74b99d46b29ac3dd3667bc5105/rita-dsl-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5e29cc274bc35d589b7ff78c627bb060", "sha256": "0e1691055c79ba673c0c03263cb7a8c14ce7314459279928021c329a961791d3" }, "downloads": -1, "filename": "rita_dsl-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5e29cc274bc35d589b7ff78c627bb060", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10602, "upload_time": "2019-07-16T17:43:48", "url": "https://files.pythonhosted.org/packages/9d/eb/b804e450065b7dbaa3913be79e1fe103cd68a1c2644b649411040ea16d67/rita_dsl-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5f0ca3e3f3ad81ea0490f61a7c483422", "sha256": "4524ab0f567e876c4badb8be999171c4ce762840e2853f870e29adf598f60fb1" }, "downloads": -1, "filename": "rita-dsl-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5f0ca3e3f3ad81ea0490f61a7c483422", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9417, "upload_time": "2019-07-16T17:43:50", "url": "https://files.pythonhosted.org/packages/cb/85/489b34004d9ab94b82c59bee621442518f74b99d46b29ac3dd3667bc5105/rita-dsl-0.1.1.tar.gz" } ] }