{ "info": { "author": "Zeio Nara", "author_email": "zeionara@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Intent and Slot-filling on the ATIS dataset task.\n\n## Model architecture\nFor this task, a model was trained to jointly predict a sentence's Intent and Slots (entities). Each word is embedded (using pre-defined word vectors) to capture the word's meaning while a character-level bidirection Long Short-Term Memory(LSTM) Network encodes the word's letters to capture its lexical structure. \n\n![](intents_slots/architecture.png)\n\nThe word vector and outputs of the character-level bi-LSTM are then fed into the word-level bi-LSTM which predicts the Intent. The second layer feeds into a Conditional Random Fields (CRF) layer to predict the individual slots (entities).\n\nThe model is stored in ```intents_slots/model.py``` \n\n## Word Vectors\n![](word_vectors/poincare.jpg) \nThe newly released Poincare word embeddings (100 dimensional) were used as they have been reported to better encode the hierarchical relationships inherent between words\n![](word_vectors/hierarchical_word_relations.png) \nyou can find the word vectors used in ```word_vectors/poincare.txt```\n\n## Demo\nyou can run a demo of the pre-trained model by running ```intents_slots/demo.py```\n![](intents_slots/results/demo.png)\n![](intents_slots/results/demo2.png)\n\n## Training\nThe model was trained for 50 epochs and stored in the pretrained_models directory \n- ```pretrained_models/dataset_info``` contains all the vocabularies used by the model (character, word, intent, entity) and their mappings to numbers for encoding/decoding\n- ```pretrained_models/model.h5``` are the weights to the model\n\nThe model's loss during training over the epochs are shown below:\n![](intents_slots/results/intent_training_loss.png)\n![](intents_slots/results/entity_training_loss.png)\n\nThe model's accuracy at predicting intents and entities (slots) over time are shown below:\n![](intents_slots/results/intent_training_accuracy.png)\n![](intents_slots/results/entity_training_accuracy.png)\n\nYou can retrain the model by running ```intents_slots/train.py```\n\n## Tests\nJoint:\n![](intents_slots/results/f1_entityintent.png)\n\nIntents only:\n![](intents_slots/results/f1_intents.png)\n\nEntites only:\n![](intents_slots/results/f1_entities.png)\n\nto above results can be obtained by running ```intents_slots/evaluate.py```\n\n## Improvements:\n![](intents_slots/results/normalisation.png)\n\nTo improve the robustness of the model to out of vocab words, the training data was lemmatised prior to training and the model was retrained. Numbers were also masked using a placeholder (e.g. *) to avoid out-of-vocab times appearing (e.g. 9:30 may appear in training but not 9:29). \n\n![](intents_slots/results/before_after.png)\n\nThe results were slightly improved given the above tweaks. Precision, Recall and F1 scores improved across the board (for both intents, entities).\n![](intents_slots/results/training_normalisation.png)\n![](intents_slots/results/f1_intents_normalisation.png)\n![](intents_slots/results/f1_entities_normalisation.png)\n![](intents_slots/results/f1_entityintent_normalisation.png)\n\nPerfect scores were achieved using the validation set!?? ```data/atis-2-dev.csv```\n![](intents_slots/results/f1_validation_intents.png)\n![](intents_slots/results/f1_validation_all.png)\n\n## Future Improvements (TO DO):\n- Balance out training data (its clear that the intent ATIS_FLIGHT dominates the training set) \n\n![](intents_slots/results/intent_support_graph.png)\n\n(and 'O' dominates the entity tags) \n![](intents_slots/results/entities_support_withO_graph.png)\n\n(or if we discount this as an entity tag - then \"to/fromloc.city_name\" tags)\n![](intents_slots/results/entities_support_graph.png)\n\n- e.g. this can be achieved by subsampling or artificially perterbing data to generate more samples (e.g. increase training instance by sliding each sentences one,two,three,etc places)\n\n- Investigate the Intents & Entities which are scoring relatively low F1 scores\n\ne.g. (intents such as ATIS_DAY_NAME, ATIS_MEAL, ATIS_FLIGHT_TIME, etc)\n\n![](intents_slots/results/intent_scores_graph.png)\n\ne.g. (entities such as compartment, booking_class, meal_code, etc)\n\n![](intents_slots/results/entity_scores_graph.png)\n\n- Preprocess intent labels with #?\n- Embed unknown words too (if possible) rather than giving them (1)\n- convert word numbers (e.g. \"one\") into digits\n- improve slot extraction using additional pre-trained Named Entity Recognition (NER)s from various libraries\n\n![](intents_slots/results/pretrained_ners.png)\n![](intents_slots/results/pretrained_ners_type.png)\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/zeionara/slize", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "slize", "package_url": "https://pypi.org/project/slize/", "platform": "", "project_url": "https://pypi.org/project/slize/", "project_urls": { "Homepage": "https://github.com/zeionara/slize" }, "release_url": "https://pypi.org/project/slize/0.0.21/", "requires_dist": null, "requires_python": ">=3.6", "summary": "Reusable Intent and Slot-filling tool", "version": "0.0.21" }, "last_serial": 5837208, "releases": { "0.0.17": [ { "comment_text": "", "digests": { "md5": "22fce5fdce348064688710095c633dab", "sha256": "94d601cc668d2fd5e3e0739af0bbba4ed29faac2a3d86ba79a860bbbb75bc77b" }, "downloads": -1, "filename": "slize-0.0.17-py3-none-any.whl", "has_sig": false, "md5_digest": "22fce5fdce348064688710095c633dab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 10606, "upload_time": "2019-09-16T15:59:18", "url": "https://files.pythonhosted.org/packages/5d/36/3db92b039dec97832bf5088e1cbe9a590c2afed44bb15d7b48028569aff1/slize-0.0.17-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7b7bf5b9229eebdbc340451aca6ad6d1", "sha256": "2cc6dd3f8512d164d525e07b29bd603110975bf6c0e3c77b402e2e905116c7cb" }, "downloads": -1, "filename": "slize-0.0.17.tar.gz", "has_sig": false, "md5_digest": "7b7bf5b9229eebdbc340451aca6ad6d1", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 10804, "upload_time": "2019-09-16T15:59:21", "url": "https://files.pythonhosted.org/packages/6d/e9/869d25751cfacaf4d8a44fbe5dcb870cd9b24df14691806ceb775fa1fa30/slize-0.0.17.tar.gz" } ], "0.0.18": [ { "comment_text": "", "digests": { "md5": "174a2565b37e2d1629ec4d1efbd0cfa2", "sha256": "310a443e9c0284b2259f99f2d22e08c4f9232017b99ad8ecd0357425a78af8ad" }, "downloads": -1, "filename": "slize-0.0.18-py3-none-any.whl", "has_sig": false, "md5_digest": "174a2565b37e2d1629ec4d1efbd0cfa2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 10596, "upload_time": "2019-09-16T16:08:04", "url": "https://files.pythonhosted.org/packages/79/93/af6fcac2299f823d7d369f4fbb9d43def9b050080af492414996e211021b/slize-0.0.18-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f3eff8fc93d43e4937c9e2827bf078bd", "sha256": "23f1de15c6ae17e8edcbe178d95aa1684fc1d8a251a0866b57ff8ba574cbb0ab" }, "downloads": -1, "filename": "slize-0.0.18.tar.gz", "has_sig": false, "md5_digest": "f3eff8fc93d43e4937c9e2827bf078bd", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 10800, "upload_time": "2019-09-16T16:08:05", "url": "https://files.pythonhosted.org/packages/3f/9b/a9269a51bd143050d959fcbdc78c6dec19754d9ceb18cce21e34ffd0feff/slize-0.0.18.tar.gz" } ], "0.0.19": [ { "comment_text": "", "digests": { "md5": "9fd70756920f0112b98819d3ab53cb86", "sha256": "2af2bd5b6a95aa0a14fecb8311b5b25f85630695ad3133f3e37e01098eb9f30a" }, "downloads": -1, "filename": "slize-0.0.19-py3-none-any.whl", "has_sig": false, "md5_digest": "9fd70756920f0112b98819d3ab53cb86", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 9823, "upload_time": "2019-09-16T16:14:32", "url": "https://files.pythonhosted.org/packages/17/87/4b11647145e1a75be4ab23a53fc21e7b0e784fbe78286b4a16c05ed8c185/slize-0.0.19-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2271c6fb65c834326ba642e7ac8f8d6f", "sha256": "a0acdc6d9e84f54313f405906c2c369df6c3b41e689db1854f8a267edd615288" }, "downloads": -1, "filename": "slize-0.0.19.tar.gz", "has_sig": false, "md5_digest": "2271c6fb65c834326ba642e7ac8f8d6f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 6386, "upload_time": "2019-09-16T16:14:34", "url": "https://files.pythonhosted.org/packages/6c/7a/a1b5b7f6fe4ffd7d76d1ad95c3b23f7c4d3f24d2abe8ce6a552fb032d791/slize-0.0.19.tar.gz" } ], "0.0.20": [ { "comment_text": "", "digests": { "md5": "59f923ff348c4e2507a30b6b74ae901e", "sha256": "7d321fecc1cbc812613905db7507e1b0d0a2acd608ebec86d7b72ea635d2f5f3" }, "downloads": -1, "filename": "slize-0.0.20-py3-none-any.whl", "has_sig": false, "md5_digest": "59f923ff348c4e2507a30b6b74ae901e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 10557, "upload_time": "2019-09-16T16:20:27", "url": "https://files.pythonhosted.org/packages/bd/3e/32adf5ff8997c3fd1681fedec35f80e809a3c74bf23b18edba9a69b140e4/slize-0.0.20-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6b1f8208d8b17e90a55311f0c78c28d0", "sha256": "08273ebb19a61ed63a4b4365e4af34554cde0c615810bc9138f0fd85f87ec6ff" }, "downloads": -1, "filename": "slize-0.0.20.tar.gz", "has_sig": false, "md5_digest": "6b1f8208d8b17e90a55311f0c78c28d0", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7135, "upload_time": "2019-09-16T16:20:29", "url": "https://files.pythonhosted.org/packages/c0/d4/43dcfd0bde931e91ac82a7d8675a743c84b5ae6208b6fc2ca505bbfe746f/slize-0.0.20.tar.gz" } ], "0.0.21": [ { "comment_text": "", "digests": { "md5": "f543efa09c165bdfa2d19f6e030baf51", "sha256": "1527c761b7c187fa1a4eeaa7427ce171dfbb72b2956de4e774f8b47f112145a2" }, "downloads": -1, "filename": "slize-0.0.21-py3-none-any.whl", "has_sig": false, "md5_digest": "f543efa09c165bdfa2d19f6e030baf51", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 10557, "upload_time": "2019-09-16T16:26:32", "url": "https://files.pythonhosted.org/packages/92/8e/683d6c6d2981e824c5b423bf10cb2fc5a210be2bf910dc32a315fd23ff2e/slize-0.0.21-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "06f3ed485685ff6175c37acf6f9c51f2", "sha256": "062b7180c3e7e700b525b1626de02fc5622075c919b4b85daa2531cbdbf2c2be" }, "downloads": -1, "filename": "slize-0.0.21.tar.gz", "has_sig": false, "md5_digest": "06f3ed485685ff6175c37acf6f9c51f2", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7133, "upload_time": "2019-09-16T16:26:34", "url": "https://files.pythonhosted.org/packages/81/4e/17dbebe6c538b0107589252d54a32920a05bbd60646326738eae2c997125/slize-0.0.21.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f543efa09c165bdfa2d19f6e030baf51", "sha256": "1527c761b7c187fa1a4eeaa7427ce171dfbb72b2956de4e774f8b47f112145a2" }, "downloads": -1, "filename": "slize-0.0.21-py3-none-any.whl", "has_sig": false, "md5_digest": "f543efa09c165bdfa2d19f6e030baf51", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 10557, "upload_time": "2019-09-16T16:26:32", "url": "https://files.pythonhosted.org/packages/92/8e/683d6c6d2981e824c5b423bf10cb2fc5a210be2bf910dc32a315fd23ff2e/slize-0.0.21-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "06f3ed485685ff6175c37acf6f9c51f2", "sha256": "062b7180c3e7e700b525b1626de02fc5622075c919b4b85daa2531cbdbf2c2be" }, "downloads": -1, "filename": "slize-0.0.21.tar.gz", "has_sig": false, "md5_digest": "06f3ed485685ff6175c37acf6f9c51f2", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7133, "upload_time": "2019-09-16T16:26:34", "url": "https://files.pythonhosted.org/packages/81/4e/17dbebe6c538b0107589252d54a32920a05bbd60646326738eae2c997125/slize-0.0.21.tar.gz" } ] }