{ "info": { "author": "Michael de Gans", "author_email": "", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: End Users/Desktop", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Topic :: Multimedia :: Sound/Audio :: Analysis", "Topic :: Multimedia :: Sound/Audio :: Speech", "Topic :: Software Development :: Internationalization", "Topic :: Text Processing :: Linguistic" ], "description": "# Substream #\n\nTranscribes an audio file to .srt subtitle format using word timings from\nGoogle's Speech-to-Text API.\n\n## Requirements: ##\n\n* A Google account, [signed up for cloud](https://cloud.google.com/).\n\n## Installing: ##\n\n```shell\npip install substream\n```\n\nCloud setup:\n\n* [Create a *new* service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts)\n for a ***new*** project dedicated to your recognition job. It must have the \n following permissions:\n \n * _Cloud Speech Service Agent_\n * _Storage Admin_ OR\n * _Storage Object Viewer_ if supplying a `gs://` URI to the script.\n \n You can set the location to the .json credentials file you downloaded in the\n current environment like:\n \n ```shell\n export GOOGLE_APPLICATION_CREDENTIALS=/path/to/cloud_credentials.json\n ```\n \n __(OR)__ you can set it just before the substream command like:\n \n ```shell\n GOOGLE_APPLICATION_CREDENTIALS=/path/to/cloud_credentials.json substream ...\n ```\n \n On run, a temporary bucket will be created, the file uploaded, and\n on completion or error, a [context manager](https://github.com/mdegans/substream/blob/master/substream/tempbucket/__init__.py)\n ensures bucket deletion.\n \n Please be careful with these credentials as cloud resources can be expensive,\n so make to store them securely if you do store them at all, and make sure all\n project buckets are deleted manually _even if the app reports they have been\n successfully deleted_.\n\nFull Usage:\n```shell\nusage: substream [-h] -i INPUT -o SRT_FILENAME [--language CODE] [-v]\n\nTranscribes an audio file or .jsonl dump to .srt using the Google Cloud\nSpeech-to-Text API\n\noptional arguments:\n -h, --help show this help message and exit\n -i INPUT, --input INPUT\n mono audio file (flac, opus, 16 bit pcm) (or) gs://\n uri to audio file (or) intermediate .jsonl dump\n (default: None)\n -o SRT_FILENAME, --output SRT_FILENAME\n .srt filename (default: None)\n --language CODE https://cloud.google.com/speech-to-text/docs/languages\n (default: en-US)\n -v, --verbose extra logging (default: False)\n```\n\nSample Usage with a local file:\n```shell\nsubstream -v -i test.flac -o test.srt --language en-US\n```\n\nSample usage with a URI:\n```shell\nsubstream -v -i gs://my-bucket/test.flac -o test.srt\n```\n\n## Uninstalling: ##\n```shell\npip uninstall substream\n```\n\n## FAQ ##\n* __Why the long-running API rather than the streaming API?__\n \n The long running API is more accurate.\n\n* __What is the .jsonl file?__\n\n Each stripped line in the file is a string containing a json representation\n of a word with it's start and end timings. Later versions of this program\n may accept the .jsonl file to format the sentences in a better way without\n having to re-run the audio transcription.\n\n* __Known Issues:__\n\n * 'walls of text' caused by people speaking without interruption. Some \n subtitles may have to be manually split using a .srt editor.\n \n * Speaker identification is currently broken in the long running \n api for long files, so splitting on this is curently disabled.\n (this exacerbates the above point)\n\n * Progress report is unimplemented by the long running API currently.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "substream", "package_url": "https://pypi.org/project/substream/", "platform": "", "project_url": "https://pypi.org/project/substream/", "project_urls": { "Bug Reports": "https://github.com/mdegans/substream/issues", "Source": "https://github.com/mdegans/substream/" }, "release_url": "https://pypi.org/project/substream/0.1.1/", "requires_dist": null, "requires_python": ">=3.6", "summary": "Transcribes audio files to .srt", "version": "0.1.1" }, "last_serial": 5364920, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "7308977ebdcb6792e7f1f35ad616aeff", "sha256": "b7428c4942b5ad4ad4cb7789a60ce52064da507d1e2d90ee8b4d4f8ed1b3c94a" }, "downloads": -1, "filename": "substream-0.1.0.tar.gz", "has_sig": false, "md5_digest": "7308977ebdcb6792e7f1f35ad616aeff", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 8528, "upload_time": "2019-06-06T00:11:56", "url": "https://files.pythonhosted.org/packages/4f/15/078607214bdbd95ebb56d1e1d723d1351e5ce27f450bbb25b63fde22852c/substream-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "756e99c9458a860de7544ec564884027", "sha256": "1387585314e257921e8efb9a640ab5548afb66a4624d269380689f2504488ba2" }, "downloads": -1, "filename": "substream-0.1.1.tar.gz", "has_sig": false, "md5_digest": "756e99c9458a860de7544ec564884027", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 8518, "upload_time": "2019-06-06T00:54:29", "url": "https://files.pythonhosted.org/packages/dd/75/0400c7628e9b411b5cf31114167e313dbc0eacd85fff698752f05af37c2f/substream-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "756e99c9458a860de7544ec564884027", "sha256": "1387585314e257921e8efb9a640ab5548afb66a4624d269380689f2504488ba2" }, "downloads": -1, "filename": "substream-0.1.1.tar.gz", "has_sig": false, "md5_digest": "756e99c9458a860de7544ec564884027", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 8518, "upload_time": "2019-06-06T00:54:29", "url": "https://files.pythonhosted.org/packages/dd/75/0400c7628e9b411b5cf31114167e313dbc0eacd85fff698752f05af37c2f/substream-0.1.1.tar.gz" } ] }