{ "info": { "author": "Alireza Rafiei (aalireza)", "author_email": "mail@rafiei.net", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Operating System :: MacOS :: MacOS X", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Internet :: WWW/HTTP :: Indexing/Search", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Utilities" ], "description": "SimpleAudioIndexer\n==================\n\n.. image:: http://rafiei.net/assets/sai/sai_logo.png\n :alt: Simple Audio Indexer: Index audio files and search for a word/phrase or match regex patterns \n :align: center\n\n|build| |license| |docs| |python| |wheel|\n\n\n- `Description <#description>`__\n- `What can it do? <#what-can-it-do>`__\n- `Documentation <#documentation>`__\n- `Requirements <#requirements>`__\n- `Installation <#installation>`__\n- `Uninstallation <#uninstallation>`__\n- `Demo <#demo>`__\n- `Nice to implement in the future <#nice-to-implement-in-the-future>`__\n- `Contributing <#contributing>`__\n- `Authors <#authors>`__\n- `License <#license>`__\n\n\nDescription\n------------\n\nThis is a Python library and command-line tool that helps you search for a word\nor a phrase within an audio file (wav format). It also builts upon the initial\nsearching capability and provides some [so-called] advanced searching abilities!\n\n\nWhat can it do?\n---------------\n\n+ Index audio files (using Watson (Online/Higher-quality) or CMU Pocketsphinx (Offline/Lower-quality)) and save/load the results.\n+ Searching within audio files in multiple languages (default is English)\n+ Define a timing error for your queries to handle discrepencies.\n+ Define constraints on your queries, e.g. whether to include (sub/super)sequences,\n results with missing words etc.\n+ Do full blown regex pattern matching!\n\n\nDocumentation\n-------------\n\nTo read the documentation, visit `here `__.\n\n\nRequirements\n------------\n\n+ Python (v2.7, 3.3, 3.4, 3.5 or 3.6) with pip installed.\n+ Watson API Credentials and/or CMU Pocketsphinx\n+ `sox`\n+ `ffmpeg` (if you choose CMU Pocketsphinx)\n+ `py.text` and `tox` (if you want to run the tests)\n\n\nInstallation\n--------------\nOpen up a terminal and enter:\n\n::\n\n pip install SimpleAudioIndexer\n\n\nInstallation details can be found at the documentations `here `__.\n\nThere's a `dockerfile `_\nincluded withing the repo if you're unable to do a native installation or are\non a Windows system.\n\n\nUninstallation\n--------------\n\nOpen up a terminal and enter:\n\n::\n\n pip uninstall SimpleAudioIndexer\n\nUninstalling `sox`, however, is dependent upon whether you're on a Linux or Mac\nsystem. For more information, visit `here `__.\n\n\nDemo\n----\n\nSay you have this audio file:\n\n|small_audio|\n\n\nHave it downloaded to an empty directory for simplicity. We'd refer to that\ndirectory as `SRC_DIR` and the name of this audio file as `small_audio.wav`.\n\nHere's how you can search through it.\n\nCommand-line Usage\n++++++++++++++++++\n\nOpen up a terminal and enter.\n\n::\n\n $ sai --mode \"ibm\" --username_ibm USERNAME --password_ibm PASSWORD --src_dir SRC_DIR --search \"called\"\n\n {'called': {'small_audio.wav': [(1.25, 1.71)]}}\n\nReplace `USERNAME` and `PASSWORD` with your IBM Watson's credentials and `SRC_DIR`\nwith the absolute path to the directory you just prepared.\n\nThe out would be, like above, a dictionary that has the query, the file(s) it\nappears in and the all of the (starting second, ending second) of that query.\n\nNote that all commands work uniformally for other engines (i.e. Pocketsphinx),\nfor example the command above can be enterred as:\n\n::\n\n $ sai --mode \"cmu\" --src_dir SRC_DIR --search \"lives\"\n\n {'our': {'small_audio': [(2.93, 3.09)]}}\n\nWhich would use Pocketsphinx instead of Watson to get the timestamps. Note that\nthe quality/accuracy of Pocketsphinx is much lower than Watson.\n\nInstead of searching for a word, you could also match a regex pattern, for example:\n\n::\n\n $ sai --mode ibm --src_dir SRC_DIR --username_ibm USERNAME --password_ibm PASSWORD --regexp \" [a-z][a-z] \"\n\n {u' in ': {'small_audio.wav': [(2.81, 2.93)]},\n {u' to ': {'small_audio.wav': [(1.71, 1.81)]}}\n\nThat was the result of searching for two letter words. Note that your results\nwould match any aribtrary regular expressions. \n\nYou may also save and load the indexed data from the command line script. For\nmore information, visit `here `__.\n\n\nLibrary Usage\n+++++++++++++\n\nSay you have this file\n\n.. code-block:: python\n\n >>> from SimpleAudioIndexer import SimpleAudioIndexer as sai\n\nAfterwards, you should create an instance of `sai`\n\n.. code-block:: python\n\n >>> indexer = sai(mode=\"ibm\", src_dir=\"SRC_DIR\", username_ibm=\"USERNAME\", password_ibm=\"PASSWORD\")\n\nNow you may index all the available audio files by calling `index_audio` method:\n\n.. code-block:: python\n\n >>> indexer.index_audio()\n\nYou could have a searching generator:\n\n.. code-block:: python\n\n >>> searcher = indexer.search_gen(query=\"called\")\n # If you're on python 2.7, instead of below, do print searcher.next()\n >>> print(next(searcher))\n {'Query': 'called', 'File Name': 'small_audio.wav', 'Result': (1.25, 1.71)}\n\nNow there are quite a few more arguments implemented for search_gen. Say you\nwanted your search to be case sensitive (by default it's not).\nOr, say you wanted to look for a phrase but there's a timing gap and the indexer\ndidn't pick it up right, you could specify `timing_error`. Or, say some word is\ncompletely missed, then you could specify `missing_word_tolerance` etc.\n\nFor a full list, see the API reference `here <./reference.html\n#SimpleAudioIndexer.SimpleAudioIndexer.search_gen>`__\n\n\nNote that you could also call `search_all` method to have search for a list of\nqueries within all the audio files:\n\nFinally, you could do a regex search!\n\n.. code-block:: python\n\n >>> print(indexer.search_regexp(pattern=\"[A-Z][^l]* \")\n {u'Americans are ca': {'small_audio.wav': [(0.21, 1.71)]}}\n\nThere are more functionalities implemented. For detailed explainations, read the\ndocumentation `here `__.\n\n\nNice to implement in the future\n--------------------------------\n\n- Uploading in parallel\n- More control structures for searching (Typos, phoneme based approximation of\n words using CMU_DICT or NLTK etc.)\n- Searching for an unintelligible audio within the audio files. Possibly by\n cross correlation or something similar.\n\n\nContributing\n-------------\n\nShould you want to contribute code or ideas, file a bug request or give\nfeedback, Visit the `CONTRIBUTING `_ file.\n\nAuthors\n-------\n\n+ **Alireza Rafiei** - `aalireza `_\n\nSee also the list of `contributors `_\nto this project.\n\nLicense\n-------\n\nThis project is licensed under the Apache v2.0 license - see the `LICENCE `_\nfile for more details.\n\n\n.. |license| image:: https://img.shields.io/pypi/l/SimpleAudioIndexer.svg\n :target: LICENSE\n :alt: Apache v2.0 License\n\n.. |docs| image:: https://readthedocs.org/projects/simpleaudioindexer/badge/?version=latest\n :target: http://simpleaudioindexer.readthedocs.io/?badge=latest\n :alt: Documentation Status\n\n.. |build| image:: https://travis-ci.org/aalireza/SimpleAudioIndexer.svg?branch=master\n :target: https://travis-ci.org/aalireza/SimpleAudioIndexer\n :alt: Build status\n\n.. |python| image:: https://img.shields.io/pypi/pyversions/SimpleAudioIndexer.svg\n :alt: Python 2,7, 3,3, 3.4, 3.5, 3.6 supported\n\n.. |wheel| image:: https://img.shields.io/pypi/wheel/SimpleAudioIndexer.svg \n :alt: Wheel ready\n\n.. |small_audio| image:: http://rafiei.net/assets/play_button.png\n :target: http://rafiei.net/assets/sai/small_audio.wav\n :alt: Demo audio file\n\n.. _Documentation: https://github.com/aalireza/SimpleAudioIndexer/docs\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/aalireza/SimpleAudioIndexer/tarball/1.0.0", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/aalireza/SimpleAudioIndexer", "keywords": "audio,indexing,search,ibm,watson,anagram,subsequence,supersequence,sequence,timestamp,cmu,sphinx,cpmsphinx,speech,speech recognition", "license": "Apache v2.0", "maintainer": "", "maintainer_email": "", "name": "SimpleAudioIndexer", "package_url": "https://pypi.org/project/SimpleAudioIndexer/", "platform": "", "project_url": "https://pypi.org/project/SimpleAudioIndexer/", "project_urls": { "Download": "https://github.com/aalireza/SimpleAudioIndexer/tarball/1.0.0", "Homepage": "https://github.com/aalireza/SimpleAudioIndexer" }, "release_url": "https://pypi.org/project/SimpleAudioIndexer/1.0.0/", "requires_dist": [ "requests" ], "requires_python": "", "summary": "Indexes audio files and searches for words/phrases or matches regex patterns within them.", "version": "1.0.0" }, "last_serial": 2636029, "releases": { "0.9.0": [ { "comment_text": "", "digests": { "md5": "aaac2d64621fcb0e4fa99327fa52c73b", "sha256": "64833ee84bccf4fb2ba3ff87e89a649fe571f76a70e96e53d64ef543c0041e76" }, "downloads": -1, "filename": "SimpleAudioIndexer-0.9.0-py2.py3-none-any.whl", "has_sig": true, "md5_digest": "aaac2d64621fcb0e4fa99327fa52c73b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 25022, "upload_time": "2017-02-04T14:36:33", "url": "https://files.pythonhosted.org/packages/67/d9/8a3813b85ba5d64702af5a99756cd3fdcf9707f4bcabf9c416f9c47071b7/SimpleAudioIndexer-0.9.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "98942e6ef417c2533af579f4ccda43f2", "sha256": "6d9c058f09fd443d375a11237f3f6982aec641e4209e91aa3f17aeeae38ce11e" }, "downloads": -1, "filename": "SimpleAudioIndexer-0.9.0.tar.gz", "has_sig": true, "md5_digest": "98942e6ef417c2533af579f4ccda43f2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 99531, "upload_time": "2017-02-04T14:36:34", "url": "https://files.pythonhosted.org/packages/00/f9/b24f9c33271c9930820116c38b781c700627e2b81153784eeff952377774/SimpleAudioIndexer-0.9.0.tar.gz" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": "34e025729dd940d3ce0437d058e87326", "sha256": "b5e9d504b2fc95dea3e6f75d9b388e0fdaad2460979fa2730e9832224fd4fe16" }, "downloads": -1, "filename": "SimpleAudioIndexer-1.0.0-py2.py3-none-any.whl", "has_sig": true, "md5_digest": "34e025729dd940d3ce0437d058e87326", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 43867, "upload_time": "2017-02-12T00:02:50", "url": "https://files.pythonhosted.org/packages/25/76/5351d3554d408901a913e6dc239ae420017469d2079196cf43e0bcb70cf0/SimpleAudioIndexer-1.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4041ec8f7a03ad4b00b048cc6b15f211", "sha256": "5c9eae1118c4b33573a1ff88a5114a61cc908b032924d5a134c47ab4546898b2" }, "downloads": -1, "filename": "SimpleAudioIndexer-1.0.0.tar.gz", "has_sig": true, "md5_digest": "4041ec8f7a03ad4b00b048cc6b15f211", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 119783, "upload_time": "2017-02-12T00:02:52", "url": "https://files.pythonhosted.org/packages/4b/8f/4fae47b91a10881c5c4141fefe3553c921c93372d5659c3b269208d8e7df/SimpleAudioIndexer-1.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "34e025729dd940d3ce0437d058e87326", "sha256": "b5e9d504b2fc95dea3e6f75d9b388e0fdaad2460979fa2730e9832224fd4fe16" }, "downloads": -1, "filename": "SimpleAudioIndexer-1.0.0-py2.py3-none-any.whl", "has_sig": true, "md5_digest": "34e025729dd940d3ce0437d058e87326", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 43867, "upload_time": "2017-02-12T00:02:50", "url": "https://files.pythonhosted.org/packages/25/76/5351d3554d408901a913e6dc239ae420017469d2079196cf43e0bcb70cf0/SimpleAudioIndexer-1.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4041ec8f7a03ad4b00b048cc6b15f211", "sha256": "5c9eae1118c4b33573a1ff88a5114a61cc908b032924d5a134c47ab4546898b2" }, "downloads": -1, "filename": "SimpleAudioIndexer-1.0.0.tar.gz", "has_sig": true, "md5_digest": "4041ec8f7a03ad4b00b048cc6b15f211", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 119783, "upload_time": "2017-02-12T00:02:52", "url": "https://files.pythonhosted.org/packages/4b/8f/4fae47b91a10881c5c4141fefe3553c921c93372d5659c3b269208d8e7df/SimpleAudioIndexer-1.0.0.tar.gz" } ] }