{ "info": { "author": "Nik Vaessen", "author_email": "nikvaes@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# Word Error Rate for automatic speech recognition\n\nThis repository contains a simple Python package to approximate the WER of a transcript. It computes the minimum-edit distance\nbetween the ground-truth sentence and the hypothesis sentence of a speech-to-text API. The minimum-edit distance is calculated\nusing the\n[Wagner-Fischer](https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm)\nalgorithm. Because this algorithm computes a character-level minimum-edit distance, each distinct word is mapped to a\nunique integer, and the edit distance is computed over the resulting sequences of integers.\n\n# Installation\n\nYou should be able to install this package using pip:\n\n```bash\n$ pip install jiwer\n```\n\n# Usage\n\nThe simplest use case is computing the WER between two strings:\n\n```python\nfrom jiwer import wer\n\nground_truth = \"hello world\"\nhypothesis = \"hello duck\"\n\nerror = wer(ground_truth, hypothesis)\n```\n\nYou can also compute the WER over multiple sentences:\n\n```python\nfrom jiwer import wer\n\nground_truth = [\"hello world\", \"i like monty python\"]\nhypothesis = [\"hello duck\", \"i like python\"]\n\nerror = wer(ground_truth, hypothesis)\n```\n\nWhen the number of ground-truth sentences differs from the number of hypothesis sentences, a minimum alignment is computed over the merged sentences:\n\n```python\nground_truth = [\"hello world\", \"i like monty python\", \"what do you mean, african or european swallow?\"]\nhypothesis = [\"hello\", \"i like\", \"python\", \"what you mean swallow\"]\n\n# is equivalent to\n\nground_truth = \"hello world i like monty python what do you mean african or european swallow\"\nhypothesis = \"hello i like python what you mean swallow\"\n```\n\n# Additional preprocessing\n\nSome additional preprocessing can be done on the input. 
By default, whitespace is removed, everything is set to lower-case,\n`.` and `,` are removed, everything between `[]` and `<>` (common for Kaldi models) is removed, and each sentence is tokenized by\nsplitting on one or more spaces. Additionally, common contractions, such as `won't`, `let's`, and `n't`, will be expanded if\n`standardize=True` is passed to the `wer` function.\n\n```python\nfrom jiwer import wer\n\nground_truth = \"he's my nemesis\"\nhypothesis = \"he is my [laughter]\"\n\nwer(ground_truth, hypothesis, standardize=True)\n\n# is equivalent to\n\nground_truth = \"he is my nemesis\"\nhypothesis = \"he is my\"\n\nwer(ground_truth, hypothesis)\n```\n\nThere is also an option to give a list of words to remove from the\ntranscription, such as \"yhe\" or \"so\".\n\n```python\nfrom jiwer import wer\n\nground_truth = \"yhe about that bug\"\nhypothesis = \"yeah about that bug\"\n\nwer(ground_truth, hypothesis, words_to_filter=[\"yhe\", \"yeah\"])\n\n# is equivalent to\n\nground_truth = \"about that bug\"\nhypothesis = \"about that bug\"\n\nwer(ground_truth, hypothesis)\n```\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/jitsi/asr-wer/", "keywords": "", "license": "Apache 2", "maintainer": "", "maintainer_email": "", "name": "jiwer", "package_url": "https://pypi.org/project/jiwer/", "platform": "", "project_url": "https://pypi.org/project/jiwer/", "project_urls": { "Homepage": "https://github.com/jitsi/asr-wer/" }, "release_url": "https://pypi.org/project/jiwer/1.3.2/", "requires_dist": [ "numpy" ], "requires_python": "", "summary": "Approximate the WER of an ASR transcript", "version": "1.3.2" }, "last_serial": 4861842, "releases": { "1.2": [ { "comment_text": "", "digests": { "md5": "2b45de63ec7929e94fb69fe816386963", "sha256": "9925ab37f917535d7c19e372b95e2e672d9d893341501ef40c9ba6082895f941" }, "downloads": 
-1, "filename": "jiwer-1.2.tar.gz", "has_sig": false, "md5_digest": "2b45de63ec7929e94fb69fe816386963", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5238, "upload_time": "2018-06-19T19:24:01", "url": "https://files.pythonhosted.org/packages/0c/28/8c06f520bd5ed3dca20dd5d707621bd9d3ac0d6e2a61bca21d23fb7ed5b5/jiwer-1.2.tar.gz" } ], "1.3": [ { "comment_text": "", "digests": { "md5": "bbf93fd940d19d3212784ccfaa03e862", "sha256": "552ad30bb294f27b342aec2bdf3747b35122c9f3ba23a61afafd923ab2da6149" }, "downloads": -1, "filename": "jiwer-1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "bbf93fd940d19d3212784ccfaa03e862", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5484, "upload_time": "2018-06-19T20:15:37", "url": "https://files.pythonhosted.org/packages/bc/47/4121ca600ebd6d6c720542b525860f50dba16f5d50d5de157377fa97bbf4/jiwer-1.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f362a49078cf56e6154e111e56888921", "sha256": "4fcb663bc25a65fa8b62b5b9272990c1bc55b21c422d53d52d31ee4d479609d3" }, "downloads": -1, "filename": "jiwer-1.3.tar.gz", "has_sig": false, "md5_digest": "f362a49078cf56e6154e111e56888921", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5240, "upload_time": "2018-06-19T20:15:38", "url": "https://files.pythonhosted.org/packages/e8/c6/f4eb8b7e76e04be1fc271d751636b9943d79100820ba07e61fee3e1340bc/jiwer-1.3.tar.gz" } ], "1.3.1": [ { "comment_text": "", "digests": { "md5": "05fa7fe4aa901da3dcd66ef10c6c8180", "sha256": "5971fc85ce18502230b7e183f716a3828707999e598e5a2517df1b0a5dff78fb" }, "downloads": -1, "filename": "jiwer-1.3.1-py3-none-any.whl", "has_sig": false, "md5_digest": "05fa7fe4aa901da3dcd66ef10c6c8180", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9710, "upload_time": "2018-12-11T13:48:31", "url": 
"https://files.pythonhosted.org/packages/ce/05/d9fd03f30f710a4c691b03010bca2b4c3e6b1ee543a77d70c8a5ff560106/jiwer-1.3.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "afb696dbe722bcb951a29bc65118787c", "sha256": "d5cfb608b168a032ae071b9f4bc8ffdf653a7e9150d7997940b72a0a6b2ec4f2" }, "downloads": -1, "filename": "jiwer-1.3.1.tar.gz", "has_sig": false, "md5_digest": "afb696dbe722bcb951a29bc65118787c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5318, "upload_time": "2018-12-11T13:48:33", "url": "https://files.pythonhosted.org/packages/8e/1e/198a34b1d1dace818b9627a417cf02d9cf7fc2d4daa275c380dfea0ee728/jiwer-1.3.1.tar.gz" } ], "1.3.2": [ { "comment_text": "", "digests": { "md5": "6daa939928ccc43c108d980f2d37c3e7", "sha256": "245a4a17a3c60373744af7d970d0bef8c5b6e0cc93108c755df12b30d0740be2" }, "downloads": -1, "filename": "jiwer-1.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "6daa939928ccc43c108d980f2d37c3e7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9713, "upload_time": "2019-02-24T19:41:29", "url": "https://files.pythonhosted.org/packages/0d/fa/87dbadc0f584c49494c72be2d2068de2b42a36f4c93e6aeea6cb1665cadf/jiwer-1.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a25789848710a0924451a08c6c4dc5f3", "sha256": "7685d73c3fdc192badac28d004ce33e419c5fb91a2298aab311ca995485529f8" }, "downloads": -1, "filename": "jiwer-1.3.2.tar.gz", "has_sig": false, "md5_digest": "a25789848710a0924451a08c6c4dc5f3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5315, "upload_time": "2019-02-24T19:41:31", "url": "https://files.pythonhosted.org/packages/c7/fd/88639901195f2625941efdf2a1496c540b33901499a986fb271af28e4436/jiwer-1.3.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6daa939928ccc43c108d980f2d37c3e7", "sha256": "245a4a17a3c60373744af7d970d0bef8c5b6e0cc93108c755df12b30d0740be2" }, "downloads": -1, 
"filename": "jiwer-1.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "6daa939928ccc43c108d980f2d37c3e7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9713, "upload_time": "2019-02-24T19:41:29", "url": "https://files.pythonhosted.org/packages/0d/fa/87dbadc0f584c49494c72be2d2068de2b42a36f4c93e6aeea6cb1665cadf/jiwer-1.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a25789848710a0924451a08c6c4dc5f3", "sha256": "7685d73c3fdc192badac28d004ce33e419c5fb91a2298aab311ca995485529f8" }, "downloads": -1, "filename": "jiwer-1.3.2.tar.gz", "has_sig": false, "md5_digest": "a25789848710a0924451a08c6c4dc5f3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5315, "upload_time": "2019-02-24T19:41:31", "url": "https://files.pythonhosted.org/packages/c7/fd/88639901195f2625941efdf2a1496c540b33901499a986fb271af28e4436/jiwer-1.3.2.tar.gz" } ] }