{ "info": { "author": "Ryan Epp", "author_email": "hey@ryanepp.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Big Phoney\n\n[![Build Status](https://travis-ci.org/repp/big-phoney.svg?branch=master)](https://travis-ci.org/repp/big-phoney) [![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n\nBig Phoney is a python module that generates phonetic pronunciations from english words.\nFor example, given the word \"dinosaur\", Big Phoney will return \"D AY1 N AH0 S AO2 R\". This is sometimes called\n\"Grapheme-to-Phoneme Conversion\" or G2P. Big Phoney works for any word, even those that don't appear in the dictionary and it's\ndesigned to handle special cases like currency and abbreviations.\n\nPhonetic pronunciations are represented using the [ARPAbet phoneme set](https://en.wikipedia.org/wiki/ARPABET).\n\nBig Phoney can also count the number of syllables in any english word.\n\n### How it Works\n\nWhen possible, pronunciations come from a dictionary which contains 134,000 words. Slang, proper-nouns, and made-up\nwords that don't appear in the standard dictionary are predicted using a model. You can read more about the\n[pronunciation prediction model on Kaggle](https://www.kaggle.com/reppic/predicting-english-pronunciations).\n\nAdditionally, Big Phoney has a number of configurable preprocessors to handle\ncases where proper pronunciation requires special knowledge. For example \"$5.00\" is pronounced \"F AY1 V D AA1 L ER0 Z\".\nCurrently, Big Phoney can handle: numbers, currency, times, symbols, abbreviations, email addresses, and urls.\n\n### Accuracy\n\nFor any of the 134,000 words found in the dictionary, you can expect 100% accurate pronunciations and syllable counts. For words not found\nin the dictionary, you can expect syllable counts to be accurate 98.1% of the time and pronunciations to be perfectly\naccurate 75.4% of the time. Even when a predicted pronunciation isn't completely correct, it's often very close.\n\n## Installation\n**Install with PyPI:**\n```\npip install big-phoney\n```\n**Install from source:**\n```\ngit clone https://github.com/repp/big-phoney.git\ncd big_phoney\npython setup.py install\n```\n## Usage\nFirst, import Big Phoney:\n```python\nfrom big_phoney import BigPhoney\n```\nNext, create an instance of the main class:\n```python\nphoney = BigPhoney()\n```\nThis will load the phonetic dictionary and prediction model into memory. It may take a second. It's in your best interest\nto *not* create multiple instances of this class.\n\nCall `phonize` to generate phonetic spellings from words.\n```python\nphoney.phonize('pterodactyl') # --> 'T EH2 R OW0 D AE1 K T AH0 L'\n\n# Works with multiple words. Individual pronunciations are seperate by 2 spaces:\nphoney.phonize('tyrannosaurus rex') # --> 'T IH0 R AE0 N AH0 S AO1 R AH0 S R EH1 K S'\n```\n\nCall `count_syllables` to get the number of syllables in a word or phrase.\n```python\nphoney.count_syllables('bird') # --> 1\nphoney.count_syllables('triceratops') # --> 4\n\n# Given multiple words, Big Phoney returns the total number of syllables:\nphoney.count_syllables('welcome to jurassic park') # --> 7\n\n# If you want a list of syllable counts, try something like:\n[phoney.count_syllables(word) for word in 'welcome to jurassic park'.split()] # --> [2,1,3,1]\n\n```\n\n## Preprocessors\nBig Phoney has a number of default preprocessors designed to improve pronunciation results in special cases.\n```python\nDEFAULT_PREPROCESSORS = [ExpandCurrencySymbols, FormatEmailAndURLs, ReplaceTimes, SpacePadSymbols,\n SpacePadNumbers, ReplaceAbbreviations, ReplaceNumbers]\n```\nBy default all of the above preprocessors are applied. You can add and remove them when creating a Big Phoney instance\nwith the `preprocessors` keyword argument:\n```python\nphoney = BigPhoney(preprocessors=[ReplaceNumbers]) # Only preprocess numbers\n```\nTo skip preprocessing entirely, just pass an empty list:\n```python\nphoney = BigPhoney(preprocessors=[]) # No preprocessing\n```\nBe careful when adjusting the default preprocessors. Their order is important as some rely on other 'upstream' processors to be most effective.\n\nTo test a preprocessor setup, use the `apply_preprocessors` method:\n```python\nphoney = BigPhoney() # Use default preprocessors\nphoney.apply_preprocessors('\u00a37.89') # --> 'seven pounds and eighty-nine pence'\nphoney.apply_preprocessors('Mt St. Helens') # --> 'mount saint helens'\nphoney.apply_preprocessors('no_reply@gmail.com') # --> 'no underscore reply at gmail dot com'\nphoney.apply_preprocessors('1ft + 2ft = 3ft') # --> 'one foot plus two feet equals three feet'\nphoney.apply_preprocessors(\"It'll be 7:00am in 1,245.6 seconds\") # --> 'it'll be seven o'clock a m in one-thousand, two hundred and forty-five point six seconds'\n```\nWriting your own preprocessors is easy. Any class with a `process` method that inputs and outputs a single string is valid. For example:\n```python\nclass DummyPreprocessor:\n\n def process(self, input_string):\n # do some preprocessing here!\n return input_string\n```\n\n## Other Options\nAs mentioned, Big Phoney uses a dictionary and a model to generate pronunciations, if you only want to use one or\nthe other, you can create instances of each individually:\n\n```python\nfrom big_phoney import PhoneticDictionary\nphonetic_dict = PhoneticDictionary()\nphonetic_dict.lookup('paleontologist') # --> 'P EY2 L IY0 AH0 N T AA1 L AH0 JH IH0 S T' \u2705\nphonetic_dict.lookup('fakeosaur') # --> None \u274c\n```\n\n```python\nfrom big_phoney import PredictionModel\npred_model = PredictionModel()\npred_model.predict('paleontologist') # --> 'P EY2 L IY0 AH0 N T AA1 L AH0 JH IH0 S T' \u2705\nphonetic_dict.lookup('fakeosaur') # --> 'F EY1 K OW0 S AO2 R' \u2705\n```\nThe dictionary is faster and always correct but won't always return a result. The model is slower and less reliably accurate\nbut it will always return *something* no matter what you throw at it. In most cases, you should just stick with\nthe `BigPhoney` class.\n\n## Contributing\nIf you want to contribute to this project that's great! Make sure to check out dev/README.md for more info.\n\n## Acknowledgements\nThe dictionary and data used to train the phonetic prediction model came from the\n[CMU Pronunciation Dictionary](http://www.speech.cs.cmu.edu/cgi-bin/cmudict).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/repp/big-phoney", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "big-phoney", "package_url": "https://pypi.org/project/big-phoney/", "platform": "", "project_url": "https://pypi.org/project/big-phoney/", "project_urls": { "Homepage": "https://github.com/repp/big-phoney" }, "release_url": "https://pypi.org/project/big-phoney/1.0.1/", "requires_dist": [ "numpy", "keras", "inflect", "h5py", "tensorflow" ], "requires_python": "", "summary": "Get phonetic spellings and syllable counts for any english word. Works with made up and non-dictionary words.", "version": "1.0.1" }, "last_serial": 4021667, "releases": { "0.0.3": [ { "comment_text": "", "digests": { "md5": "5ed379854b8ed066a49fad890caa8f12", "sha256": "a5fa560143a1078dd72e4a4a99910d7466760224866225843aaa8853907c7eb2" }, "downloads": -1, "filename": "big_phoney-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "5ed379854b8ed066a49fad890caa8f12", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11726289, "upload_time": "2018-06-30T03:56:30", "url": "https://files.pythonhosted.org/packages/d3/b6/7e66ac376b311a02fa78085bb894c6476c179ad994c519bf00efb2f06cbf/big_phoney-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "582edd2ecac5fd5d7d8aefb6e6d32a3f", "sha256": "34bf902df7b9f3b35d82963e878303338ab6cb1c93f8ef01004fcd1b35ad4976" }, "downloads": -1, "filename": "big_phoney-0.0.3.tar.gz", "has_sig": false, "md5_digest": "582edd2ecac5fd5d7d8aefb6e6d32a3f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7181, "upload_time": "2018-06-30T03:56:33", "url": "https://files.pythonhosted.org/packages/bf/c5/048179f6863835237ae5aec433ae42363f6cbfc176b61e109a559ce51cd2/big_phoney-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "f4e48b89c786336f35ebcde84a5385a3", "sha256": "ce6e9091d762eabab3c787e43fd5b05f6835e918183dc88b9b70f50afaf83f46" }, "downloads": -1, "filename": "big_phoney-0.0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "f4e48b89c786336f35ebcde84a5385a3", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11732993, "upload_time": "2018-06-30T16:40:16", "url": "https://files.pythonhosted.org/packages/d9/37/dc25c44e02f59626b4998957d4d8fa289f2affb777650de1826988533326/big_phoney-0.0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2ef501346614e3450adb234e89a31fe5", "sha256": "9d8dff964e89643cfb110272e528d44b969b805d2a6c91381dd66b350d41b525" }, "downloads": -1, "filename": "big_phoney-0.0.4.tar.gz", "has_sig": false, "md5_digest": "2ef501346614e3450adb234e89a31fe5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11196, "upload_time": "2018-06-30T16:40:19", "url": "https://files.pythonhosted.org/packages/98/e3/e2fe083fafb5fe81de676740c5d17659c9b62ea8004b43d24b3842f8df47/big_phoney-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "3fbd13999c9d4c25c412e3c116cf6b31", "sha256": "6abc19fd5e6bafb51c0d60963dfa15b634169fca0fd5f54d0c69b0469a43a2ff" }, "downloads": -1, "filename": "big_phoney-0.0.5-py3-none-any.whl", "has_sig": false, "md5_digest": "3fbd13999c9d4c25c412e3c116cf6b31", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11735467, "upload_time": "2018-07-02T04:01:37", "url": "https://files.pythonhosted.org/packages/dc/79/5e044547c40df6c31626c9c18401e094473d830ece116cc028386626b158/big_phoney-0.0.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cddf54c17b68168278775551c4f3cd43", "sha256": "33255fe2e9fc729122e56fee2cafe2f162a9e0fd27d7f3b554906bc27f182c55" }, "downloads": -1, "filename": "big_phoney-0.0.5.tar.gz", "has_sig": false, "md5_digest": "cddf54c17b68168278775551c4f3cd43", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16463, "upload_time": "2018-07-02T04:01:42", "url": "https://files.pythonhosted.org/packages/a9/cc/3e64e67fda6185d8ed360cd52009e56cff67d5470cb0ca043f2e03c70c37/big_phoney-0.0.5.tar.gz" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": "6451cb0009c047dba53f72fa65af7f60", "sha256": "cd00a08f2883796be92b323aea643cbfb0aa4c51ca3f7938997457f6e794d463" }, "downloads": -1, "filename": "big_phoney-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "6451cb0009c047dba53f72fa65af7f60", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11728643, "upload_time": "2018-07-02T04:49:10", "url": "https://files.pythonhosted.org/packages/06/9f/5d230ec589edf7aa273ad383156ed3127ab5ac02e8f5193ac0f1ecec2d9e/big_phoney-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "de06a541a0d071c0cf4444aa51104946", "sha256": "b09015e71f636d3dde2db1586c65d49a276db3517170e2add043791911e1f4dd" }, "downloads": -1, "filename": "big_phoney-1.0.0.tar.gz", "has_sig": false, "md5_digest": "de06a541a0d071c0cf4444aa51104946", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16489, "upload_time": "2018-07-02T04:49:13", "url": "https://files.pythonhosted.org/packages/b3/1f/119b3ddc80099b1873e4e1f5cbce120d04b11f05bdc05cc37a0269188c42/big_phoney-1.0.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "cb804cca96c60802029056b89fd646cd", "sha256": "29b29ef69b909c75ef65bb5e724785737f80caf3143b6830fcbd591755a19b92" }, "downloads": -1, "filename": "big_phoney-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "cb804cca96c60802029056b89fd646cd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11728752, "upload_time": "2018-07-02T05:30:03", "url": "https://files.pythonhosted.org/packages/1f/ef/85d2640e11feff932c051199294f804fa390fdcb6c745d3e2a0da0d9e8d8/big_phoney-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2f138e36858f4ad891304e8403e297b0", "sha256": "bdd3e115ee1e0e75c6208f358fa05fbcd17414441e6eb8c6b96f2a63ec7a9049" }, "downloads": -1, "filename": "big_phoney-1.0.1.tar.gz", "has_sig": false, "md5_digest": "2f138e36858f4ad891304e8403e297b0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16727, "upload_time": "2018-07-02T05:30:14", "url": "https://files.pythonhosted.org/packages/2a/9c/f4074d6052eb3d62d74bb3009d77fe0742ad9e0d6cc0ac1e936b247d979e/big_phoney-1.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cb804cca96c60802029056b89fd646cd", "sha256": "29b29ef69b909c75ef65bb5e724785737f80caf3143b6830fcbd591755a19b92" }, "downloads": -1, "filename": "big_phoney-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "cb804cca96c60802029056b89fd646cd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11728752, "upload_time": "2018-07-02T05:30:03", "url": "https://files.pythonhosted.org/packages/1f/ef/85d2640e11feff932c051199294f804fa390fdcb6c745d3e2a0da0d9e8d8/big_phoney-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2f138e36858f4ad891304e8403e297b0", "sha256": "bdd3e115ee1e0e75c6208f358fa05fbcd17414441e6eb8c6b96f2a63ec7a9049" }, "downloads": -1, "filename": "big_phoney-1.0.1.tar.gz", "has_sig": false, "md5_digest": "2f138e36858f4ad891304e8403e297b0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16727, "upload_time": "2018-07-02T05:30:14", "url": "https://files.pythonhosted.org/packages/2a/9c/f4074d6052eb3d62d74bb3009d77fe0742ad9e0d6cc0ac1e936b247d979e/big_phoney-1.0.1.tar.gz" } ] }