{ "info": { "author": "Shahab Sabahi", "author_email": "sabahi.s@mysol-gc.jp", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "Programming Language :: Python", "Programming Language :: Python :: 3.7" ], "description": "Speech Recognition wave2words \n\nThis package is a test sample. It contains two functions that act as a single program: \na pronunciation/rhythm/stress word-and-phrase game (English). The package could be customized into\na speech-to-text system by accepting input from a microphone, an audio file, or both. The \npackage could be adapted to any language of choice. \n\nIn this package, we test our wave2word speech recognition using AI, for English. However,\nin future releases, other languages will be added to make the speech recognition\nlanguage-independent. \n\nConcept\n\nWe may guess a word from an unknown spoken language just by listening to the sound of the speech. \nIn our view, in an interaction between a human and a machine, the machine should recognise sounds before\nmaking sense of the meaning of any words (built from combinations of sounds). In other words, \na speech recogniser should pick up sounds and form words without pre-determining the language. \n\nOur speech processing system focuses on understanding the acoustic features of \nsounds, then building the words that are spoken by human beings. 
The speech signals are captured with \nthe help of a microphone and are then interpreted by the system.\n\nThe difficulty of speech recognition can be broadly characterised along several \ndimensions: 1) the size of the vocabulary of the specific language; 2) speaker dependency: the sound of\na particular word varies from one person to another; 3) channel quality: direct human speech\nhas high bandwidth with the full frequency range, while telephone speech has low \nbandwidth with a limited frequency range; 4) speaking mode: whether the speech is in isolated-word\nmode, connected-word mode, or continuous-speech mode (continuous speech is \nharder to recognise); 5) speaking style: read aloud, spontaneous, or conversational; 6) type \nof noise: the signal-to-noise ratio may fall in various ranges, depending on whether the acoustic environment\nhas less or more background noise; 7) microphone quality and the distance between the mouth \nand the microphone. \n\nRecording and Sampling\n\nDuring recording with a microphone, the signals are stored in digitised form. To work on them, \nthe machine needs them in discrete numeric form. Hence, our algorithm should sample the signal\nat a particular frequency and convert it into discrete numerical form. Choosing a sufficiently high\nsampling frequency means that when humans listen to the signal, they perceive it as a continuous\naudio signal.\n\nTransforming to Frequency Domain\n\nCharacterising an audio signal involves converting the time-domain signal into the frequency domain \nand understanding its frequency components. This is an essential step because it gives a lot of information\nabout the signal. You can use a mathematical tool like the Fourier transform to perform this transformation. 
\nThis transformation is the most critical step in building a speech recogniser because, after converting the \nspeech signal into the frequency domain, we must convert it into a usable feature vector.\nWe can use different feature extraction techniques such as MFCC, PLP, and PLP-RASTA for this purpose.\n\nMyvoicerecognition is unique in its aim to provide a complete quantitative and analytical way to study the acoustic\nfeatures of speech. Moreover, those features could be analysed further using Python's functionality\nto provide more fascinating insights into speech patterns. \n\nThis library is for linguists, scientists, developers, speech and language therapy clinics, and researchers. \nPlease note that Myvoicerecognition Analysis is currently in an initial state, though in active development. While \nthe amount of functionality now present is not large, more will be added over the next few months.\n\n=============\nInstallation\n=============\nMyvoicerecognition can be installed like any other Python library, using (a recent version of) the Python package \nmanager pip, on Linux, macOS, and Windows:\n\n------------- pip install Myvoicerecognition ------------------------------\nor, to update your installed version to the latest release:\n------------- pip install -U Myvoicerecognition ---------------------------------\n\nNOTE: \n\nYou need to have the following packages installed: \n-----the Microsoft Visual C++ Redistributable for Visual Studio 2017 ------x86 or x64-----see your system\n-----PyAudio---PyAudio>= 0.2.11---pip install PyAudio (Windows),\n----------------------------------$ sudo apt-get install python-pyaudio python3-pyaudio (Debian-based Linux),\n----------------------------------$ brew install portaudio ----$ pip install pyaudio (macOS)\n-----PyAudio-0.2.11-cp37-cp37m-win32.whl or win64.whl -----if your system throws an error for PyAudio \n\nyou may get the third file from \n---------- 
https://github.com/Shahabks/Myvoicerecognition------\nsave it in a directory and, in cmd (command line), run from that directory: .../directory/ pip install PyAudio-0.2.11-cp37-cp37m-winxx.whl.\n\nThe package uses the default system microphone. If your system has no default microphone, or you want to use \na microphone other than the default, you will need to specify which one to use by supplying a device index. \n\nTo see how the Myvoicerecognition functions behave, please check \n---------------- EXAMPLES.docx on --------\n------------- https://github.com/Shahabks/Myvoicerecognition-----\n\nMyvoicerecognition was developed by MYSOLUTIONS Lab in Japan. It is part of the New Generation of Voice\nRecognition and Acoustic & Language modeling Project at MYSOLUTIONS Lab. We plan to enrich \nthe functionality of Myvoicerecognition by adding more advanced functions.\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Shahabks/Myvoicerecognition", "keywords": "praat speech signal processing phonetics", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "Myvoicerecognition", "package_url": "https://pypi.org/project/Myvoicerecognition/", "platform": "", "project_url": "https://pypi.org/project/Myvoicerecognition/", "project_urls": { "Homepage": "https://github.com/Shahabks/Myvoicerecognition" }, "release_url": "https://pypi.org/project/Myvoicerecognition/1/", "requires_dist": [ "numpy (>=1.15.2)", "SpeechRecognition (>=3.8.1)", "pandas (>=0.23.4)", "scipy (>=1.1.0)" ], "requires_python": "", "summary": "a speech recognition system structured based on an acoustic and a language model", "version": "1" }, "last_serial": 4660776, "releases": { "1": [ { "comment_text": "", "digests": { "md5": "2aed22ff85884bf06b05afb809840e18", "sha256": "0eee45abd766b713b9beafd9e210e8b8ff6d62395e19037fad7af99f7b660fa0" }, "downloads": -1, "filename": 
"Myvoicerecognition-1-py3-none-any.whl", "has_sig": false, "md5_digest": "2aed22ff85884bf06b05afb809840e18", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 29791, "upload_time": "2019-01-04T16:23:40", "url": "https://files.pythonhosted.org/packages/a3/b5/ae8a12630bf501d52ab1c29d458e1f737965da6f68ef5f45ee381d7ca248/Myvoicerecognition-1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "12a8c55e93be08647890fd1b1ca0d731", "sha256": "2c95ebca096d7fd6e50d85f755e345dd11716620f0602c1c71949bb3fe2cca85" }, "downloads": -1, "filename": "Myvoicerecognition-1.tar.gz", "has_sig": false, "md5_digest": "12a8c55e93be08647890fd1b1ca0d731", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5817, "upload_time": "2019-01-04T16:23:42", "url": "https://files.pythonhosted.org/packages/0a/b6/36fb4782508dcfd0912964edf39f542b5302ccb64cb07dc43a8aac452dfc/Myvoicerecognition-1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2aed22ff85884bf06b05afb809840e18", "sha256": "0eee45abd766b713b9beafd9e210e8b8ff6d62395e19037fad7af99f7b660fa0" }, "downloads": -1, "filename": "Myvoicerecognition-1-py3-none-any.whl", "has_sig": false, "md5_digest": "2aed22ff85884bf06b05afb809840e18", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 29791, "upload_time": "2019-01-04T16:23:40", "url": "https://files.pythonhosted.org/packages/a3/b5/ae8a12630bf501d52ab1c29d458e1f737965da6f68ef5f45ee381d7ca248/Myvoicerecognition-1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "12a8c55e93be08647890fd1b1ca0d731", "sha256": "2c95ebca096d7fd6e50d85f755e345dd11716620f0602c1c71949bb3fe2cca85" }, "downloads": -1, "filename": "Myvoicerecognition-1.tar.gz", "has_sig": false, "md5_digest": "12a8c55e93be08647890fd1b1ca0d731", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5817, "upload_time": "2019-01-04T16:23:42", "url": 
"https://files.pythonhosted.org/packages/0a/b6/36fb4782508dcfd0912964edf39f542b5302ccb64cb07dc43a8aac452dfc/Myvoicerecognition-1.tar.gz" } ] }