{ "info": { "author": "Tongshuang Wu", "author_email": "wtshuang@cs.washington.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: Unix", "Programming Language :: Python", "Programming Language :: Python :: 3.6" ], "description": "# Errudite\n\n**This open-source release is a work in progress!**\n\nErrudite is an interactive tool for scalable, reproducible, and counterfactual error analysis. \nErrudite provides an expressive domain-specific language for extracting relevant features of\nlinguistic data, which allows users to visualize data attributes, group relevant instances,\nand perform counterfactual analysis across all available validation data. \n\n\n## Getting Started\n\n1. Watch [this video demo](https://youtu.be/Dil5i0AYyu8), which highlights Errudite's functions & use cases \n2. Get [set up](#installation) quickly\n3. Try [Errudite's user interface](#gui-server) on machine comprehension\n4. Try the [tutorials on JupyterLab notebooks](#jupyterlab-tutorial)\n5. 
Read the [documentation](https://errudite.readthedocs.io/en/latest/)\n\n## Citation\nIf you are interested in this work, please see our \n[ACL 2019 research paper](https://homes.cs.washington.edu/~wtshuang/files/acl2019_errudite.pdf)\nand consider citing our work:\n```\n@inproceedings{2019-errudite,\n title = {Errudite: Scalable, Reproducible, and Testable Error Analysis},\n author = {Wu, Tongshuang and Ribeiro, Marco Tulio and Heer, Jeffrey and Weld, Daniel S.},\n booktitle={the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)},\n year = {2019},\n url = {https://homes.cs.washington.edu/~wtshuang/files/acl2019_errudite.pdf},\n}\n```\n\n## Quick Start\n\n### Installation\n\n#### PIP\nErrudite requires Python 3.6.x. The package is available through `pip`. \nJust install it in your Python environment and you're good to go!\n\n```sh\n# create the virtual environment\nvirtualenv --no-site-packages -p python3.6 venv\n# activate venv\nsource venv/bin/activate\n# install errudite\npip install errudite\n```\n\n#### Install from source\n\nYou can also install Errudite by cloning our git repository:\n\n```sh\ngit clone https://github.com/uwdata/errudite\n```\n\nCreate a Python 3.6 virtual environment, and install Errudite in `editable` mode by running:\n\n```sh\npip install --editable .\n```\n\nThis will make `errudite` available on your system, but it will use the sources from your local clone\nof the source repository.\n\n### GUI Server\n\nErrudite provides a UI for Machine Comprehension and Visual Question Answering tasks.\nThe interface integrates all the key analysis functions (e.g., inspecting instance attributes,\ngrouping similar instances, rewriting instances). It also provides exploration \nsupport such as visualizing data distributions, suggesting potential queries, and presenting the \ngrouping and rewriting results. 
While not strictly necessary, the interface makes these analysis functions much \nmore straightforward to apply.\n\nTo get a taste of the GUI for the machine comprehension task, you should first download a cache folder \nof preprocessed [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) instances, which lets you\nskip running your own preprocessing:\n\n```\npython -m errudite.download\n\nCommands:\n cache_folder_name\n A folder name. Currently, we allow downloading the following:\n squad-100, squad-10570.\n cache_path A local path where you want to save the cache folder to.\n```\n\nThen, start the server: \n\n```sh\n# activate venv\nsource venv/bin/activate\n# the model relies on AllenNLP, so make sure you install it first.\npip install allennlp==0.8.4\npython -m errudite.server\n\nCommands:\n config_file\n A yaml config file path.\n```\nThe config file looks like the following (see also [config.yml](config.yml)):\n\n```yml\ntask: qa # the task, either \"qa\" or \"vqa\".\ncache_path: {cache_path}/{cache_folder_name}/ # the cached folder.\nmodel_metas: # a model.\n- name: bidaf\n model_class: bidaf # an implemented model class\n model_path: # a local model file path\n # an online path to an AllenNLP model\n model_online_path: https://s3-us-west-2.amazonaws.com/allennlp/models/bidaf-model-2017.09.15-charpad.tar.gz\n description: Pretrained model from AllenNLP, for the BiDAF model (QA)\nattr_file_name: null # If set, load previously saved analysis.\ngroup_file_name: null\nrewrite_file_name: null\n```\n\nThen visit `http://localhost:5000/` in your web browser.\n\n\n### JupyterLab Tutorial\n\nBesides being used through the GUI, Errudite also serves as a general Python package. The tutorial goes\nthrough:\n1. Preprocessing the data, and extending Errudite to different tasks & predictors\n2. Creating data attributes and data groups with a domain specific language (or your customized functions).\n3. 
Creating rewrite rules with the domain specific language (or your customized functions).\n\nTo go through the tutorial, do the following steps:\n\n```sh\n# clone the repo\ngit clone https://github.com/uwdata/errudite\n# initial folder: errudite/\n# create the virtual environment\nvirtualenv --no-site-packages -p python3.6 venv\n# activate venv\nsource venv/bin/activate\n\n# run the default setup script\npip install --editable .\n\n# get to the tutorial folder, and start!\ncd tutorials\npip install -r requirements_tutorials.txt\njupyter lab\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/tongshuangwu/", "keywords": "", "license": "BSD 3-Clause License", "maintainer": "Tongshuang Wu", "maintainer_email": "wtshuang@cs.washington.edu", "name": "errudite", "package_url": "https://pypi.org/project/errudite/", "platform": "Windows", "project_url": "https://pypi.org/project/errudite/", "project_urls": { "Homepage": "https://github.com/tongshuangwu/" }, "release_url": "https://pypi.org/project/errudite/0.0.3/", "requires_dist": [ "altair (==3.0.0)", "pandas (==0.24.2)", "overrides (==1.9)", "tqdm (==4.31.1)", "dill (==0.2.9)", "spacy (==2.1.3)", "pattern (==3.6.0)", "nltk (==3.4)", "pyyaml (==5.1.1)", "flask (==1.0.3)", "flask-cors (==3.0.8)" ], "requires_python": "", "summary": "NLP error analysis.", "version": "0.0.3" }, "last_serial": 5469818, "releases": { "0.0.2": [ { "comment_text": "", "digests": { "md5": "e41ec1bac40a2d6bac669d0184978b03", "sha256": "e0cfa5b72ec67e6c716a3e724e5349cba3d5655956443966584a75bfefe877d9" }, "downloads": -1, "filename": "errudite-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "e41ec1bac40a2d6bac669d0184978b03", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 172218, "upload_time": "2019-07-01T06:16:39", "url": 
"https://files.pythonhosted.org/packages/ff/73/7f6e28a5ad684ef973384e913c9d702cfc9e3cd8642cee1b7df9da84f744/errudite-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "507266ed5063b41188ec2637eb632a78", "sha256": "6d1c4921eb1cc74d530375481cc2cf04d6cabfb5ac054bac73652be1c937c43f" }, "downloads": -1, "filename": "errudite-0.0.2.tar.gz", "has_sig": false, "md5_digest": "507266ed5063b41188ec2637eb632a78", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 125607, "upload_time": "2019-07-01T06:16:42", "url": "https://files.pythonhosted.org/packages/a4/ff/55091d165f86c17abb2b579d3a2415e8d691da9fa819647792d2c935f468/errudite-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "792a3553a8d2718f8a803d5afe355fef", "sha256": "8a4d5274ce01bf35633f21c3a71831d4d39539de62ddb0942c5da461192a6705" }, "downloads": -1, "filename": "errudite-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "792a3553a8d2718f8a803d5afe355fef", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3726720, "upload_time": "2019-07-01T06:41:52", "url": "https://files.pythonhosted.org/packages/38/bb/f3277869abb80aa5ae258e210bf7454526872f9739a73bdab35249468c2a/errudite-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3ef6fe60bf2df9fdecc5cceebf3bfde5", "sha256": "48c81f2df6af0f33535278dfe7cc854b3d3cc2ee2c96fab963e363586262be1c" }, "downloads": -1, "filename": "errudite-0.0.3.tar.gz", "has_sig": false, "md5_digest": "3ef6fe60bf2df9fdecc5cceebf3bfde5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 127340, "upload_time": "2019-07-01T06:41:55", "url": "https://files.pythonhosted.org/packages/51/41/73279ed5a3e9eadaa11f536b22ca23a621f5cdd0ce2813b979ce1024a173/errudite-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "792a3553a8d2718f8a803d5afe355fef", "sha256": 
"8a4d5274ce01bf35633f21c3a71831d4d39539de62ddb0942c5da461192a6705" }, "downloads": -1, "filename": "errudite-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "792a3553a8d2718f8a803d5afe355fef", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3726720, "upload_time": "2019-07-01T06:41:52", "url": "https://files.pythonhosted.org/packages/38/bb/f3277869abb80aa5ae258e210bf7454526872f9739a73bdab35249468c2a/errudite-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3ef6fe60bf2df9fdecc5cceebf3bfde5", "sha256": "48c81f2df6af0f33535278dfe7cc854b3d3cc2ee2c96fab963e363586262be1c" }, "downloads": -1, "filename": "errudite-0.0.3.tar.gz", "has_sig": false, "md5_digest": "3ef6fe60bf2df9fdecc5cceebf3bfde5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 127340, "upload_time": "2019-07-01T06:41:55", "url": "https://files.pythonhosted.org/packages/51/41/73279ed5a3e9eadaa11f536b22ca23a621f5cdd0ce2813b979ce1024a173/errudite-0.0.3.tar.gz" } ] }