{ "info": { "author": "Qordoba", "author_email": "sam.havens@qordoba.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: Other/Proprietary License", "Natural Language :: English", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Text Processing :: Linguistic", "Typing :: Typed" ], "description": "# FitBERT\n\n![buff bert](img/fitbert.png)\n\nFitBert ((F)ill (i)n (t)he blanks, (BERT)) is a library for using [BERT](https://arxiv.org/abs/1810.04805) to fill in the blank(s) in a section of text from a list of options. Here is the envisioned usecase for FitBert:\n\n1. A service (statistical model or something simpler) suggests replacements/corrections for a segment of text\n2. That service is specialized to a domain, and isn't good at the big picture, e.g. grammar\n3. That service passes the segment of text, with the words to be replaced identified, and the list of suggestions\n4. FitBert _crushes_ all but the best suggestion :muscle:\n\n## Installation\n\n## License\n\nThis software is distributed under the Apache 2.0 license, except for the WordNet lemma data used for delemmatization, which is distributed with its original license, which is located in `./fitbert/data/LICENSE`.\n\n## From PyPi\n\n`pip install fitbert`\n\n## Usage\n\n[A Jupyter notebook with a short introduction is available here.](https://colab.research.google.com/drive/1WrYzy9l_arpnTlhCCKViiilPe4WKZJjq)\n\nFitBert will automatically use GPU if `torch.cuda.is_available()`. Which is a good thing, because CPU inference time is really bad:\n\n![cpu inference time chart showing roughly 1 prediction per second](https://imgur.com/3u1U9P8.png)\n\nHere is what GPU inference times are looking like\n\n![gpu inference time chart showing roughly 100x speedup over CPU to approximately 100 per second](https://imgur.com/aEUmJXn.png)\n\n### Usage as a library / in a server\n\n```python\nfrom fitbert import FitBert\n\n\n# in theory you can pass a model_name and tokenizer\n# currently supported models: bert-large-uncased and distilbert-base-uncased\n# this takes a while and loads a whole big BERT into memory\nfb = FitBert()\n\nmasked_string = \"Why Bert, you're looking ***mask*** today!\"\noptions = ['buff', 'handsome', 'strong']\n\nranked_options = fb.rank(masked_string, options=options)\n# >>> ['handsome', 'strong', 'buff']\n# or\nfilled_in = fb.fitb(masked_string, options=options)\n# >>> \"Why Bert, you're looking handsome today!\"\n```\n\nWe commonly find ourselves knowing what verb to suggest, but not what conjugation:\n\n```python\nfrom fitbert import FitBert\n\n\nfb = FitBert()\n\nmasked_string = \"Why Bert, you're ***mask*** handsome today!\"\noptions = ['looks']\n\nfilled_in = fb.fitb(masked_string, options=options)\n# >>> \"Why Bert, you're looking handsome today!\"\n\n# under the hood, we notice there is only one suggestion and act as if\n# fitb was called with delemmatize=True:\nfilled_in = fb.fitb(masked_string, options=options, delemmatize=True)\n```\n\nIf you are already using `pytorch_pretrained_bert.BertForMaskedLM`, or `transformers.BertForMaskedLM` and have an instance of BertForMaskedLM already instantiated, you can pass pass it in to reuse it:\n\n```python\nBLM = pytorch_pretrained_bert.BertForMaskedLM.from_pretrained(model_name)\n# or\nBLM = transfomers.BertForMaskedLM.from_pretrained(model_name)\nfb = FitBert(model=BLM)\n```\n\nYou can also have FitBert mask the string for you\n\n```python\nfrom fitbert import FitBert\n\n\nfb = FitBert()\n\nunmasked_string = \"Why Bert, you're looks handsome today!\"\nspan_to_mask = (17, 22)\nmasked_string, masked = fb.mask(unmasked_string, span_to_mask)\n# >>> \"Why Bert, you're ***mask*** handsome today!\", 'looks'\n\n# you can set options = [masked] or use any List[str]\noptions = [masked]\n\nfilled_in = fb.fitb(masked_string, options=options)\n# >>> \"Why Bert, you're looking handsome today!\"\n```\n\nand there is a convenience method for doing this:\n\n```python\nunmasked_string = \"Why Bert, you're looks handsome today!\"\nspan_to_mask = (17, 22)\n\nfilled_in = fb.mask_fitb(unmasked_string, span_to_mask)\n# >>> \"Why Bert, you're looking handsome today!\"\n```\n\n### Client\n\nIf you are sending strings to a FitBert server, you need to either mask the string yourself, or identify the span you want masked:\n\n```python\nfrom fitbert import FitBert\n\ns = \"This might be justified as a means of signalling the connection between drunken driving and fatal accidents.\"\n\nbetter_string, span_to_change = MyRuleBasedNLPModel.remove_overly_fancy_language(s)\n\nassert better_string == \"This might be justified to signalling the connection between drunken driving and fatal accidents.\", \"Notice 'as a means of' became 'to', but we didn't re-conjuagte signalling, or fix the spelling mistake\"\n\nassert span_to_change == (27, 37), \"This span is the start and stop of the characters for the substring 'signalling'.\"\n\nmasked_string, replaced_substring = FitBert.mask(better_string, span_to_change)\n\nassert masked_string == \"This might be justified to ***mask*** the connection between drunken driving and fatal accidents.\"\n\nassert replaced_substring == \"signalling\"\n\nFitBertServer.fitb(masked_string, options=[replaced_substring])\n```\n\nThe benefit to doing this over masking yourself is that if the internally used masking token changes, you don't have to know about that. Also, you don't need to make an instance of FitBert, so you don't have to incur the cost of downloading a pretrained Bert model.\n\nHowever, you could also write your `CallFitBertServer` function to take an unmasked string and a span, something like:\n\n```python\nFitBertServer.mask_fitb(better_string, span_to_change)\n```\n\nAnd then not need to install `FitBert` in your client at all.\n\n## Development\n\nRun tests with `python -m pytest` or `python -m pytest -m \"not slow\"` to skip the 20 seconds of loading pretrained bert.\n\n### Acknowledgement\n\nI am trying to get in touch with [NodoBird](https://drawception.com/player/546330/nodo-bird/), the artist of the fantastic portrait of Bert depicted above.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Qordobacode/fitbert", "keywords": "", "license": "unlicensed", "maintainer": "", "maintainer_email": "", "name": "fitbert", "package_url": "https://pypi.org/project/fitbert/", "platform": "", "project_url": "https://pypi.org/project/fitbert/", "project_urls": { "Homepage": "https://github.com/Qordobacode/fitbert" }, "release_url": "https://pypi.org/project/fitbert/0.6.0/", "requires_dist": null, "requires_python": ">=3.5", "summary": "Use BERT to Fill in the Blanks", "version": "0.6.0" }, "last_serial": 6001665, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "49101903b5728e0335da0bdc59607c35", "sha256": "97e43041c87a3746fceec955b5a6c397bd3145fc7dd82114093bfe431874c70a" }, "downloads": -1, "filename": "fitbert-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "49101903b5728e0335da0bdc59607c35", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 212232, "upload_time": "2019-06-28T19:57:11", "url": "https://files.pythonhosted.org/packages/11/0f/f9bf2b40ae13f9f16d89a65ea911fd29b2ac11d613dc00db6a37021815bc/fitbert-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5e28e64e1722a18eb8c82e25ae625c98", "sha256": "44cf3549e7a3d926f9d3c92bb1c8ef667effa268ed518dcff284f898f9fdbfe4" }, "downloads": -1, "filename": "fitbert-0.0.1.tar.gz", "has_sig": false, "md5_digest": "5e28e64e1722a18eb8c82e25ae625c98", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 214044, "upload_time": "2019-06-28T19:57:14", "url": "https://files.pythonhosted.org/packages/9d/20/b93265ddd0a6dbaa9ac0a7af70a4ab0eeda147e72884b7ce4a4e95c058b4/fitbert-0.0.1.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "25d44378cbe7319913fa585a332f3cdd", "sha256": "9fffc6d1952887ad0a2d17716d578a6dce3d035317e2c1b7fffa6c99eeb0c667" }, "downloads": -1, "filename": "fitbert-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "25d44378cbe7319913fa585a332f3cdd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 213470, "upload_time": "2019-07-23T23:48:55", "url": "https://files.pythonhosted.org/packages/60/84/fb064de0427eb825b9792ee7fa954714b55beb48ff4a9aa5f1a509a6fe14/fitbert-0.1.0-py3-none-any.whl" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "3d107382f4daf099d5b9f09f07a4f064", "sha256": "7449d546362e277f9c20031f9419bf597cc427cbc16d277d5b15b69af42fa61b" }, "downloads": -1, "filename": "fitbert-0.2.3.tar.gz", "has_sig": false, "md5_digest": "3d107382f4daf099d5b9f09f07a4f064", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 213541, "upload_time": "2019-08-05T16:35:19", "url": "https://files.pythonhosted.org/packages/dc/7c/71f154d26b0376f7f2c0fcb0330dce0cb9adbfb3cf731ec4ffe8d8d583e8/fitbert-0.2.3.tar.gz" } ], "0.2.4": [ { "comment_text": "", "digests": { "md5": "79c551b4dbb50b7115de4120d5acf04c", "sha256": "a3a0569e22affa68e947accbb4ac35c9818f244ce3d8dd9b6f5fe4a4ead7f435" }, "downloads": -1, "filename": "fitbert-0.2.4.tar.gz", "has_sig": false, "md5_digest": "79c551b4dbb50b7115de4120d5acf04c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 213596, "upload_time": "2019-08-09T01:11:20", "url": "https://files.pythonhosted.org/packages/9c/80/e74a0a50400d0f1df9ca368a01a94da4695b638607ce24b41276421c3a01/fitbert-0.2.4.tar.gz" } ], "0.2.5": [ { "comment_text": "", "digests": { "md5": "2f998aa5e1a1c430cb7821b649a401c8", "sha256": "17f9fe95c97dfba5781038dda8f721ce0dabfaafedbe660674f75a9d1736941b" }, "downloads": -1, "filename": "fitbert-0.2.5.tar.gz", "has_sig": false, "md5_digest": "2f998aa5e1a1c430cb7821b649a401c8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 213662, "upload_time": "2019-08-09T18:25:40", "url": "https://files.pythonhosted.org/packages/15/2c/b03b7b2a47f6b88cbfd8aa32ac1acdced0864b471962250efb94d6d5d771/fitbert-0.2.5.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "c3500ca10e13eab73ee6df33161bb1be", "sha256": "00ee674767e8f4e4505b22df6da1ec9456634d57e97787870ff639e7e6f3ac92" }, "downloads": -1, "filename": "fitbert-0.4.0.tar.gz", "has_sig": false, "md5_digest": "c3500ca10e13eab73ee6df33161bb1be", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 216134, "upload_time": "2019-09-11T00:39:28", "url": "https://files.pythonhosted.org/packages/dd/ac/917ce8b19a727f5c855a7c38570322bb43a23908f073788fdd7a861b3168/fitbert-0.4.0.tar.gz" } ], "0.5.0": [ { "comment_text": "", "digests": { "md5": "b18551bff68c0f27a4683cb431b95389", "sha256": "c08bf6367fb23cde7e6543a669ee6585c0fb2601e520ce96befb42350e233e75" }, "downloads": -1, "filename": "fitbert-0.5.0.tar.gz", "has_sig": false, "md5_digest": "b18551bff68c0f27a4683cb431b95389", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 213393, "upload_time": "2019-09-11T18:32:25", "url": "https://files.pythonhosted.org/packages/34/4e/f0ce3ae4b13b9ac905af026d7e48d11b6fd0d2c490bb31ab0484d872023a/fitbert-0.5.0.tar.gz" } ], "0.6.0": [ { "comment_text": "", "digests": { "md5": "70ec249f4802af5acabc4097a73e40a9", "sha256": "46230d09a9f0743c18694ac5cfc376ed3c95920a386c58e4b87f0063950a5f38" }, "downloads": -1, "filename": "fitbert-0.6.0.tar.gz", "has_sig": false, "md5_digest": "70ec249f4802af5acabc4097a73e40a9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 215904, "upload_time": "2019-10-20T02:21:42", "url": "https://files.pythonhosted.org/packages/2a/1b/d5576574635a5ab10baafda0b09db79799515aa9b011a2bb1f7a3992d95b/fitbert-0.6.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "70ec249f4802af5acabc4097a73e40a9", "sha256": "46230d09a9f0743c18694ac5cfc376ed3c95920a386c58e4b87f0063950a5f38" }, "downloads": -1, "filename": "fitbert-0.6.0.tar.gz", "has_sig": false, "md5_digest": "70ec249f4802af5acabc4097a73e40a9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 215904, "upload_time": "2019-10-20T02:21:42", "url": "https://files.pythonhosted.org/packages/2a/1b/d5576574635a5ab10baafda0b09db79799515aa9b011a2bb1f7a3992d95b/fitbert-0.6.0.tar.gz" } ] }