{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX :: Linux", "Programming Language :: C", "Programming Language :: Python :: 3.6", "Topic :: Multimedia :: Sound/Audio :: Speech", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "# Pocketsphinx Python\n\nPocketsphinx is a part of the [CMU Sphinx](http://cmusphinx.sourceforge.net) Open Source Toolkit For Speech Recognition.\n\nThis package provides a python interface to CMU [Sphinxbase](https://github.com/cmusphinx/sphinxbase) and [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx) libraries created with [SWIG](http://www.swig.org) and [Setuptools](https://setuptools.readthedocs.io).\n\n## Supported platforms\n\n* Windows (untested)\n* Linux\n* Mac OS X (untested)\n\n### Install requirements\n\nWindows requirements:\n\n* [Python](https://www.python.org/downloads)\n* [Git](http://git-scm.com/downloads)\n* [Swig](http://www.swig.org/download.html)\n* [Visual Studio Community](https://www.visualstudio.com/ru-ru/downloads/download-visual-studio-vs.aspx)\n\nUbuntu requirements:\n\n```shell\nsudo apt-get install -qq python python-dev python-pip build-essential swig git libpulse-dev libasound2-dev\n```\n\nMac OS X requirements:\n\n```shell\nbrew reinstall swig python\n```\n\n## Installation\n\n```shell\n# Make sure we have up-to-date versions of pip, setuptools and wheel\npython -m pip install --upgrade pip setuptools wheel\npip install --upgrade pocketsphinx\n```\n\nMore binary distributions for manual installation are available [here](https://pypi.org/project/pocketsphinx/#files).\n\n### Installing Models\n\nPocketsphinx models in ``.tar.gz`` format can be installed using this packages as well.\n\n```python\nfrom pocketsphinx import PocketsphinxModel, AudioFile\n\nmodels = PocketsphinxModel(model_path='/some/installation/path')\n# this will install the model from the give url under name 'de'\nmodels.install_model('https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/German/cmusphinx-de-voxforge-5.2.tar.gz', 'de')\n\nde = models.get_model('de')\n# this returns a dictionary with the locations of hmm, lm and dict of the model\n\n# we can now use the 'de' model directly with any pocketsphinx object\nfor phrase in LiveSpeech(model=de): print(phrase)\n\n```\n\nThe default ``model_path`` is ``'~/pocketsphinx_models'``.\n\n## Usage\n\n### LiveSpeech\n\nIt's an iterator class for continuous recognition or keyword search from a microphone.\n\n```python\nfrom pocketsphinx import LiveSpeech\nfor phrase in LiveSpeech(): print(phrase)\n```\n\nAn example of a keyword search:\n\n```python\nfrom pocketsphinx import LiveSpeech\n\nspeech = LiveSpeech(lm=False, keyphrase='forward', kws_threshold=1e-20)\nfor phrase in speech:\n print(phrase.segments(detailed=True))\n```\n\nWith your model and dictionary:\n\n```python\nimport os\nfrom pocketsphinx import LiveSpeech, get_model_path\n\nmodel_path = get_model_path()\n\nspeech = LiveSpeech(\n verbose=False,\n sampling_rate=16000,\n buffer_size=2048,\n no_search=False,\n full_utt=False,\n hmm=os.path.join(model_path, 'en-us'),\n lm=os.path.join(model_path, 'en-us.lm.bin'),\n dic=os.path.join(model_path, 'cmudict-en-us.dict')\n)\n\nfor phrase in speech:\n print(phrase)\n```\n\n### StreamSpeech\n\nThis can be used to send chunks of raw bytes to the iterator, usually when transferring audio over a socket or similar.\n\n```python\nfrom pocketsphinx import StreamSpeech\n\nf = open('somefile.wav', 'rb')\n\ndef callback():\n return f.read(2048)\n\nfor phrase in StreamSpeech(callback=callback): print(phrase)\n```\n\nFor an example of keyword search and custom models, see *LiveSpeech*.\n\n### AudioFile\n\nIt's an iterator class for continuous recognition or keyword search from a file.\n\n```python\nfrom pocketsphinx import AudioFile\nfor phrase in AudioFile(): print(phrase) # => \"go forward ten meters\"\n```\n\nAn example of a keyword search:\n\n```python\nfrom pocketsphinx import AudioFile\n\naudio = AudioFile(lm=False, keyphrase='forward', kws_threshold=1e-20)\nfor phrase in audio:\n print(phrase.segments(detailed=True)) # => \"[('forward', -617, 63, 121)]\"\n```\n\nWith your model and dictionary:\n\n```python\nimport os\nfrom pocketsphinx import AudioFile, get_model_path, get_data_path\n\nmodel_path = get_model_path()\ndata_path = get_data_path()\n\nconfig = {\n 'verbose': False,\n 'audio_file': os.path.join(data_path, 'goforward.raw'),\n 'buffer_size': 2048,\n 'no_search': False,\n 'full_utt': False,\n 'hmm': os.path.join(model_path, 'en-us'),\n 'lm': os.path.join(model_path, 'en-us.lm.bin'),\n 'dict': os.path.join(model_path, 'cmudict-en-us.dict')\n}\n\naudio = AudioFile(**config)\nfor phrase in audio:\n print(phrase)\n```\n\nConvert frame into time coordinates:\n\n```python\nfrom pocketsphinx import AudioFile\n\n# Frames per Second\nfps = 100\n\nfor phrase in AudioFile(frate=fps): # frate (default=100)\n print('-' * 28)\n print('| %5s | %3s | %4s |' % ('start', 'end', 'word'))\n print('-' * 28)\n for s in phrase.seg():\n print('| %4ss | %4ss | %8s |' % (s.start_frame / fps, s.end_frame / fps, s.word))\n print('-' * 28)\n\n# ----------------------------\n# | start | end | word |\n# ----------------------------\n# | 0.0s | 0.24s | |\n# | 0.25s | 0.45s | |\n# | 0.46s | 0.63s | go |\n# | 0.64s | 1.16s | forward |\n# | 1.17s | 1.52s | ten |\n# | 1.53s | 2.11s | meters |\n# | 2.12s | 2.6s | |\n# ----------------------------\n```\n\n### Pocketsphinx\n\nIt's a simple and flexible proxy class to `pocketsphinx.Decode`.\n\n```python\nfrom pocketsphinx import Pocketsphinx\nprint(Pocketsphinx().decode()) # => \"go forward ten meters\"\n```\n\nA more comprehensive example:\n\n```python\nfrom __future__ import print_function\nimport os\nfrom pocketsphinx import Pocketsphinx, get_model_path, get_data_path\n\nmodel_path = get_model_path()\ndata_path = get_data_path()\n\nconfig = {\n 'hmm': os.path.join(model_path, 'en-us'),\n 'lm': os.path.join(model_path, 'en-us.lm.bin'),\n 'dict': os.path.join(model_path, 'cmudict-en-us.dict')\n}\n\nps = Pocketsphinx(**config)\nps.decode(\n audio_file=os.path.join(data_path, 'goforward.raw'),\n buffer_size=2048,\n no_search=False,\n full_utt=False\n)\n\nprint(ps.segments()) # => ['', '', 'go', 'forward', 'ten', 'meters', '']\nprint('Detailed segments:', *ps.segments(detailed=True), sep='\\n') # => [\n# word, prob, start_frame, end_frame\n# ('', 0, 0, 24)\n# ('', -3778, 25, 45)\n# ('go', -27, 46, 63)\n# ('forward', -38, 64, 116)\n# ('ten', -14105, 117, 152)\n# ('meters', -2152, 153, 211)\n# ('', 0, 212, 260)\n# ]\n\nprint(ps.hypothesis()) # => go forward ten meters\nprint(ps.probability()) # => -32079\nprint(ps.score()) # => -7066\nprint(ps.confidence()) # => 0.04042641466841839\n\nprint(*ps.best(count=10), sep='\\n') # => [\n# ('go forward ten meters', -28034)\n# ('go for word ten meters', -28570)\n# ('go forward and majors', -28670)\n# ('go forward and meters', -28681)\n# ('go forward and readers', -28685)\n# ('go forward ten readers', -28688)\n# ('go forward ten leaders', -28695)\n# ('go forward can meters', -28695)\n# ('go forward and leaders', -28706)\n# ('go for work ten meters', -28722)\n# ]\n```\n\n### Default config\n\nIf you don't pass any argument while creating an instance of the Pocketsphinx, AudioFile or LiveSpeech class, it will use next default values:\n\n```python\nverbose = False\nlogfn = /dev/null or nul\naudio_file = site-packages/pocketsphinx/data/goforward.raw\naudio_device = None\nsampling_rate = 16000\nbuffer_size = 2048\nno_search = False\nfull_utt = False\nhmm = site-packages/pocketsphinx/model/en-us\nlm = site-packages/pocketsphinx/model/en-us.lm.bin\ndict = site-packages/pocketsphinx/model/cmudict-en-us.dict\n```\n\nAny other option must be passed into the config as is, without using symbol `-`.\n\nIf you want to disable default language model or dictionary, you can change the value of the corresponding options to False:\n\n```python\nlm = False\ndict = False\n```\n\n### Verbose\n\nSend output to stdout:\n\n```python\nfrom pocketsphinx import Pocketsphinx\n\nps = Pocketsphinx(verbose=True)\nps.decode()\n\nprint(ps.hypothesis())\n```\n\nSend output to file:\n\n```python\nfrom pocketsphinx import Pocketsphinx\n\nps = Pocketsphinx(verbose=True, logfn='pocketsphinx.log')\nps.decode()\n\nprint(ps.hypothesis())\n```\n\n### Compatibility\n\nParent classes are still available:\n\n```python\nimport os\nfrom pocketsphinx import DefaultConfig, Decoder, get_model_path, get_data_path\n\nmodel_path = get_model_path()\ndata_path = get_data_path()\n\n# Create a decoder with a certain model\nconfig = DefaultConfig()\nconfig.set_string('-hmm', os.path.join(model_path, 'en-us'))\nconfig.set_string('-lm', os.path.join(model_path, 'en-us.lm.bin'))\nconfig.set_string('-dict', os.path.join(model_path, 'cmudict-en-us.dict'))\ndecoder = Decoder(config)\n\n# Decode streaming data\nbuf = bytearray(1024)\nwith open(os.path.join(data_path, 'goforward.raw'), 'rb') as f:\n decoder.start_utt()\n while f.readinto(buf):\n decoder.process_raw(buf, False, False)\n decoder.end_utt()\nprint('Best hypothesis segments:', [seg.word for seg in decoder.seg()])\n```\n\n## Projects using pocketsphinx-python\n\n* [SpeechRecognition](https://github.com/Uberi/speech_recognition) - Library for performing speech recognition, with support for several engines and APIs, online and offline.\n\n## License\n\n[The BSD License](https://github.com/bambocher/pocketsphinx-python/blob/master/LICENSE)", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "sphinxbase pocketsphinx", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "pocketsphinx-fork", "package_url": "https://pypi.org/project/pocketsphinx-fork/", "platform": "", "project_url": "https://pypi.org/project/pocketsphinx-fork/", "project_urls": null, "release_url": "https://pypi.org/project/pocketsphinx-fork/1.0.0/", "requires_dist": null, "requires_python": "", "summary": "Forked version of [pocketsphinx-python](https://github.com/bambocher/pocketsphinx-python) which adds utility for installing models, and the StreamSpeech interface.", "version": "1.0.0" }, "last_serial": 5385405, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "54bd146d695f25d3bd8f2c563226b505", "sha256": "d1f29c38a2178e188c6cc197e54710378fc60b05025696ed0ab4492fa5bcf6ff" }, "downloads": -1, "filename": "pocketsphinx-fork-1.0.0.tar.gz", "has_sig": false, "md5_digest": "54bd146d695f25d3bd8f2c563226b505", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 29095437, "upload_time": "2019-06-11T08:11:16", "url": "https://files.pythonhosted.org/packages/ad/fa/aa4098478dc488b935bcba0a87ca69ded86ec01f800915ae0f34c6bd8418/pocketsphinx-fork-1.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "54bd146d695f25d3bd8f2c563226b505", "sha256": "d1f29c38a2178e188c6cc197e54710378fc60b05025696ed0ab4492fa5bcf6ff" }, "downloads": -1, "filename": "pocketsphinx-fork-1.0.0.tar.gz", "has_sig": false, "md5_digest": "54bd146d695f25d3bd8f2c563226b505", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 29095437, "upload_time": "2019-06-11T08:11:16", "url": "https://files.pythonhosted.org/packages/ad/fa/aa4098478dc488b935bcba0a87ca69ded86ec01f800915ae0f34c6bd8418/pocketsphinx-fork-1.0.0.tar.gz" } ] }