{ "info": { "author": "Paul Michel", "author_email": "", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Text Processing" ], "description": "\"teapot\"/\n\n# TEAPOT\n\nTEAPOT (**T**ool for **E**valuating **A**dversarial **P**erturbations **O**n **T**ext) is a toolkit to evaluate the effectiveness of adversarial perturbations on NLP systems by taking into account the preservation of meaning in the source.\n\nAdversarial perturbations (perturbations to the input of a model that elicit large changes in the output), have been shown to be an effective way of assessing the robustness of machine models.\nHowever, these perturbations only indicate weaknesses in the model if they do not change the input so significantly that it legitimately result in changes in the expected output. While this is easy to control when the input is real-valued (images for example), the situation is more problematic on discrete data such as natural language.\n\nTEAPOT is an implementation of the evaluation framework described in the NAACL 2019 paper [On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models](https://arxiv.org/abs/1903.06620), wherein an adversarial attack is evaluated using two quantities:\n\n- `s_src(x, x')`: a measure of the semantic similarity between the original input `x` and its adversarial perturbation `x'`.\n- `d_tgt(y(x),y(x'),y*)`: a measure of how much the output similarity `s_tgt` (w.r.t. the reference `y*`) decreases when the model is ran on the adversarial input (giving output `y(x')`) instead of the original input (giving `y(x)`). Specifically `d_tgt(y(x),y(x'),y*)` is defined as `0` if `s_tgt(y,y*) < s_tgt(y',y*)` and `(s_tgt(y,y*) - s_tgt(y',y*)) / s_tgt(y,y*)` otherwise.\n\nAn attack is declared **successful** on `x,y` when `s_src(x,x') + d_tgt(y(x),y(x'),y*) > 1`, in other words, *an adversarial attack changing `x` to `x'` is successful if it destroys the target more than it destroys the source*.\n\nWith TEAPOT, you can compute `s_src`, `d_tgt` and the success rate of an attack easily using proxy metrics for the source and target similarity (`chrF` by default).\n\n\n\n## Getting started\n\n### Installation and Requirements\n\nTEAPOT works with python>=3.6. The only required non-standard dependency for teapot is [sacrebleu](https://github.com/mjpost/sacreBLEU) (a neat tool for computing BLEU and chrF on detokenized text). You can install with `python setup.py install` from the root of the repo, or simply `pip install teapot-nlp` from anywhere you want.\n\n### Basic Usage (sequence-to-sequence)\n\nGiven the original input `examples/MT/src.fr`, adversarial input `examples/MT/adv.charswap.fr`, reference output `examples/MT/ref.en`, original output (output of the model on the original input) `examples/MT/base.en` and adversarial output (output of the model on the adversarial input) `examples/MT/adv.charswap.en`, running:\n\n```bash\nteapot \\\n --src examples/MT/src.fr \\\n --adv-src examples/MT/adv.charswap.fr \\\n --out examples/MT/base.en \\\n --adv-out examples/MT/adv.charswap.en \\\n --ref examples/MT/ref.en\n```\n\nwill output:\n\n```\nSource side preservation (ChrF):\nMean: 86.908\nStd: 11.622\n5%-95%: 64.109-97.683\n--------------------------------------------------------------------------------\nTarget side degradation (ChrF):\nMean: 21.085\nStd: 22.106\n5%-95%: 0.000-67.162\n--------------------------------------------------------------------------------\nSuccess percentage: 65.20 %\n```\n\nAlternatively you can specify only `--src` and `--adv-src` (for source side evaluation) *or* only `--out`,`--adv-out` and `--ref` (for target side evaluation).\n\nYou can learn more about the command line options by running `teapot -h`. Notably you can specify which score to use with `--s-src` and `--s-tgt` (refer to the command line help for the list of scores implemented in your version).\n\n### Basic Usage (other tasks)\n\nWhile the default settings are geared towards evaluating attacks on sequence-to-sequence models, teapot can be used to evaluate attacks on otjer types of NLP models. For example, for a text classification model we might use the zero-one loss as `s_tgt` (`1` if the output label is the same as the reference, `0` otherwise). Here is an example with an attack on a sentiment classification model trained on [SST](https://nlp.stanford.edu/sentiment/):\n\n```bash\nteapot \\\n --src examples/sentiment/src.txt \\\n --adv-src examples/sentiment/adv_src.txt \\\n --out examples/sentiment/out.txt \\\n --adv-out examples/sentiment/adv_out.txt \\\n --ref examples/sentiment/ref.txt \\\n --s-tgt zero_one \\\n --success-threshold 1.8\n```\n\nNotice that we specified `--success-threshold 1.8`, which means that an attack will be considered successful only when `s_src(x,x') + d_tgt(y(x),y(x'),y*) > 1.8`. While using `1` as a threshold makes sense when `s_src` and `s_tgt` are the same, when `s_tgt` is the zero-one loss this is a poor choice, as any attack that flips the label will be successful regardless of `s_src`. By upping the threshold to `1.8` we enforce that `s_src` (here chrF) should be at least `0.8` for an attack to be successful.\n\n## Advanced usage\n\n### Custom scorers\n\nTEAPOT comes with predefined scores to compute the source and target side similarity. However, in some cases you might want to define you own score. Fortunately this can be done in a few steps if you are familiar with python:\n\n1. Write your own `teapot.Scorer` subclass in a python source file (there are examples in [examples/custom_scorers.py]()). This is the hard part.\n2. Call teapot with the arguments `--custom-scores-source path/to/your/scorer.py` and `--score [score_key]` where `[score_key]` is the shorthand you have defined for your scorer with `teapot.register_scorer` (again, see the examples in [examples/custom_scorers.py]() for a walkthrough)\n3. If your scorer works fine, and it doesn't rely on heavy dependencies, consider contributing it to TEAPOT by\n 1. Adding the class to [teapot/scorers.py]()\n 2. Adding a simple unit test to [tests/test_scorers.py]()\n\nHere is an example where `s_tgt` is the `f1` score defined in [examples/custom_scorers.py]():\n\n```bash\nteapot \\\n --src examples/MT/src.fr \\\n --adv-src examples/MT/adv.charswap.fr \\\n --out examples/MT/base.en \\\n --adv-out examples/MT/adv.charswap.en \\\n --ref examples/MT/ref.en \\\n --custom-scores-source examples/custom_scorers.py \\\n --s-tgt f1\n```\n\nOr when `s_src` is the `constant` score (with auxiliary argument `--value`)\n\n```bash\nteapot \\\n --src examples/MT/src.fr \\\n --adv-src examples/MT/adv.charswap.fr \\\n --out examples/MT/base.en \\\n --adv-out examples/MT/adv.charswap.en \\\n --ref examples/MT/ref.en \\\n --custom-scores-source examples/custom_scorers.py \\\n --s-src constant \\\n --value 0.3\n```\n\n\n### METEOR\n\nYou can use the [METEOR](http://www.cs.cmu.edu/~alavie/METEOR/) metric by specifying `--{s-src,s-tgt} meteor`, however this will require you to have java installed and METEOR somewhere on your machine and specify the path to the `.jar` with `--meteor-jar`. This is only tested for METEOR-1.5 on linux.\n\nYou can get METEOR by downloading it from the [website](http://www.cs.cmu.edu/~alavie/METEOR/) or in the command line:\n\n```bash\nwget http://www.cs.cmu.edu/\\\\\\~alavie/METEOR/download/meteor-1.5.tar.gz\n# This will put the jar at ./meteor-1.5/meteor-1.5.jar\ntar xvzf meteor-1.5.tar.gz\n```\n\nTEAPOT will run the jar from python. You can specify the java command to run the jar with `--java-command` (default `java -Xmx2G -jar`)\n\n### Programmatic Usage\n\nHere is an example of how to use TEAPOT in your own code:\n\n```python\nimport teapot\n# Instantiate the scorer of your choice\nchrf_scorer = teapot.ChrF()\nbleu_scorer = teapot.BLEU()\nmeteor_scorer = teapot.METEOR(\"path/to/meteor.jar\", java_command=\"java -Xmx2G -jar\")\n# Compute s_src for example\n# This will return a list of chrf scores\ns_src = chrf_scorer.score(adv_inputs, original_inputs)\n# Compute d_tgt\n# This will return a list of relative difference in scores (clamped to positive values)\nd_tgt = chrf_scorer.rd_score(adv_outputs, original_outputs, reference_outputs)\n```\n\n## Reference-less evaluation\n\nIn the case where no target reference `y*` is available, one can treat as one the output `y(x)=y*`.\nWe can then use:\n\n- `s_src(x, x')`: a measure of the semantic similarity between the original input `x` and its adversarial perturbation `x'`.\n- `s_tgt(y(x), y(x'))`: a measure of the semantic similarity between the respective outputs.\n\nSetting `y* := y(x)` (we are interested in how well the model deviates from its own predictions), the original criterion for success:\n```s_src(x,x') + d_tgt(y(x),y(x'),y*) > 1```\nbecomes\n```s_src(x,x') + 1 - s_tgt(y(x),y(x')) > 1```\nwhich is equivalent to\n```s_src(x,x') / s_tgt(y(x),y(x')) > 1.```\nThe intuition here is slightly different: now, a *successful* attack has caused the system to magnify the source-side adversarial noise.\n\nSimply run the same commands without providing a reference file, for reference-less evaluation.\nFor example:\n\n```\nteapot --src examples/MT/src.fr \\\n --adv-src examples/MT/adv.charswap.fr \\\n --out examples/MT/base.en \\\n --adv-out examples/MT/adv.charswap.en\n```\nwill print\n```\nNo reference file provided. We will use the reference-less criterion.\nSource side preservation (ChrF):\nMean: 86.908\nStd: 11.622\n5%-95%: 64.109-97.683\n--------------------------------------------------------------------------------\nTarget side preservation (ChrF):\nMean: 62.733\nStd: 21.529\n5%-95%: 16.650-90.555\n--------------------------------------------------------------------------------\nSuccess percentage: 97.60 %\n```\n\n## License\n\nThe code is released under the [MIT License](LICENSE). Credits to [@eseniko](https://giphy.com/eseniko) for the image.\n\n\n## Citing\n\nIf you use this software in your own research, consider citing the following paper:\n\n```\n@InProceedings{michel2019onevaluation,\n author = {Michel, Paul and Neubig, Graham and Li, Xian and Pino, Juan Miguel},\n title = {On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models},\n year = {2019},\n booktitle = {Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)}\n}\n```", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pmichel31415/teapot-nlp", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "teapot-nlp", "package_url": "https://pypi.org/project/teapot-nlp/", "platform": "", "project_url": "https://pypi.org/project/teapot-nlp/", "project_urls": { "Homepage": "https://github.com/pmichel31415/teapot-nlp" }, "release_url": "https://pypi.org/project/teapot-nlp/0.2.1/", "requires_dist": null, "requires_python": "", "summary": "Source and target side evaluation of adversarial attacks on NLP models", "version": "0.2.1" }, "last_serial": 5097604, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "c584b97ef9cb53059d9c9cb6e6e8bfd5", "sha256": "ff0388d8b274517db20f338930b6f0f9d94a697ec2f4768f21a1382631dc0ec6" }, "downloads": -1, "filename": "teapot-nlp-0.1.tar.gz", "has_sig": false, "md5_digest": "c584b97ef9cb53059d9c9cb6e6e8bfd5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12275, "upload_time": "2019-03-13T22:30:47", "url": "https://files.pythonhosted.org/packages/d7/2a/d990a38de3dd88b4148db22e945d88163d4936013389b796f8a0edfa63c3/teapot-nlp-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "ae9acc6868a689462d1c1ed2bda3ce37", "sha256": "f80b34d14359208ee37868925ba2f00728932e4d6ae933283c22b2560cf28340" }, "downloads": -1, "filename": "teapot-nlp-0.2.tar.gz", "has_sig": false, "md5_digest": "ae9acc6868a689462d1c1ed2bda3ce37", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12449, "upload_time": "2019-03-18T22:08:19", "url": "https://files.pythonhosted.org/packages/4c/db/3010a267871293ed7115ff253ba33ed7c1a239a3dd4eb2b1cc4bb14c0b01/teapot-nlp-0.2.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "f8cfeffdc6df6cb15fe4fbbf2f8c7d04", "sha256": "4453c5a007dd4cd6cfa54ef49d34d6a23d1b7969e5be514961f6a3bc68a82714" }, "downloads": -1, "filename": "teapot-nlp-0.2.1.tar.gz", "has_sig": false, "md5_digest": "f8cfeffdc6df6cb15fe4fbbf2f8c7d04", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13599, "upload_time": "2019-04-04T14:20:41", "url": "https://files.pythonhosted.org/packages/d5/e0/49677535a248972396964013aa5eccc4e5855c39e683e18cded36e2dbd73/teapot-nlp-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f8cfeffdc6df6cb15fe4fbbf2f8c7d04", "sha256": "4453c5a007dd4cd6cfa54ef49d34d6a23d1b7969e5be514961f6a3bc68a82714" }, "downloads": -1, "filename": "teapot-nlp-0.2.1.tar.gz", "has_sig": false, "md5_digest": "f8cfeffdc6df6cb15fe4fbbf2f8c7d04", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13599, "upload_time": "2019-04-04T14:20:41", "url": "https://files.pythonhosted.org/packages/d5/e0/49677535a248972396964013aa5eccc4e5855c39e683e18cded36e2dbd73/teapot-nlp-0.2.1.tar.gz" } ] }