{ "info": { "author": "Guenter Bartsch", "author_email": "guenter@zamia.org", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Topic :: Multimedia :: Sound/Audio :: Speech", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "# kaldi-adapt-lm\n\nAdapt Kaldi-ASR nnet3 chain models from Zamia-Speech.org to a different\nlanguage model.\n\nConstructive comments, patches and pull-requests are very welcome.\n\nTutorial\n--------\n\nTo create the language model we would like to adapt our kaldi model to, we first\nneed to create a set of sentences. To get started, download and uncompress a generic set\nof sentences for you language, e.g.\n\n wget 'http://goofy.zamia.org/zamia-speech/misc/sentences-en.txt.xz'\n unxz sentences-en.txt.xz\n\nnow suppose the file utts.txt contained the sentences you would like the model to\nrecognize with a higher probability than the rest. To achieve that, we add these\nsentences five times in this examples to our text body:\n\n cat utts.txt utts.txt utts.txt utts.txt utts.txt sentences-en.txt >lm.txt\n\nwe also want to limit our language model to the vocabulary the audio model supports,\nso let's extract the vocabulary next:\n\n MODEL=\"models/kaldi-generic-en-tdnn_sp-latest\"\n cut -f 1 -d ' ' ${MODEL}/data/local/dict/lexicon.txt >vocab.txt\n\nwith those files in place we can now train our new language model using KenLM:\n\n lmplz -o 4 --prune 0 1 2 3 --limit_vocab_file vocab.txt --interpolate_unigrams 0 lm.arpa\n\nNow we can start the kaldi model adaptation process:\n\n kaldi-adapt-lm ${MODEL} lm.arpa mymodel\n\nYou should now be able to find a tarball of the resulting model inside the work subdirectory.\n\nLinks\n-----\n\n- [Kaldi ASR] \n- [Zamia Speech] \n\nRequirements\n------------\n\n- Python 2\n- Kaldi ASR\n\nLicense\n-------\n\nMy own code is Apache-2.0 licensed unless otherwise noted in the\nscript\u2019s copyright headers.\n\nAuthor\n------\n\nGuenter Bartsch \\<\\>\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/gooofy/kaldi-adapt-lm", "keywords": "natural language processing nlp asr speech recognition kaldi", "license": "Apache", "maintainer": "", "maintainer_email": "", "name": "kaldi-adapt-lm", "package_url": "https://pypi.org/project/kaldi-adapt-lm/", "platform": "", "project_url": "https://pypi.org/project/kaldi-adapt-lm/", "project_urls": { "Homepage": "https://github.com/gooofy/kaldi-adapt-lm" }, "release_url": "https://pypi.org/project/kaldi-adapt-lm/0.1.3/", "requires_dist": [ "py-nltools" ], "requires_python": "", "summary": "Adapt Kaldi-ASR nnet3 chain models from Zamia-Speech.org to a different language model.", "version": "0.1.3" }, "last_serial": 4202874, "releases": { "0.1.3": [ { "comment_text": "", "digests": { "md5": "498ea12488241df69f207ca0d1b0615c", "sha256": "108fb4ee444188095fb74569daa01e95b4de66af47405d0670b6e23b2b95a74a" }, "downloads": -1, "filename": "kaldi_adapt_lm-0.1.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "498ea12488241df69f207ca0d1b0615c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 10248, "upload_time": "2018-08-24T08:35:56", "url": "https://files.pythonhosted.org/packages/1e/45/b253ce5673905b0ff1772b16a3f3160bf39fb94e947fa08d83bf611e7650/kaldi_adapt_lm-0.1.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "68efdfb55437fae31bea4e1bd77fd165", "sha256": "81a094adee4d27e22e7a6877d4cdd0c9948a70c69fac2822e7a7fb5f22cb2bc1" }, "downloads": -1, "filename": "kaldi-adapt-lm-0.1.3.tar.gz", "has_sig": false, "md5_digest": "68efdfb55437fae31bea4e1bd77fd165", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4090, "upload_time": "2018-08-24T08:35:57", "url": "https://files.pythonhosted.org/packages/41/10/ea00fb6ed3ef886b6f3e31dc1e8fb9adda5fce4eb1f33fc23288ca8d1aff/kaldi-adapt-lm-0.1.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "498ea12488241df69f207ca0d1b0615c", "sha256": "108fb4ee444188095fb74569daa01e95b4de66af47405d0670b6e23b2b95a74a" }, "downloads": -1, "filename": "kaldi_adapt_lm-0.1.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "498ea12488241df69f207ca0d1b0615c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 10248, "upload_time": "2018-08-24T08:35:56", "url": "https://files.pythonhosted.org/packages/1e/45/b253ce5673905b0ff1772b16a3f3160bf39fb94e947fa08d83bf611e7650/kaldi_adapt_lm-0.1.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "68efdfb55437fae31bea4e1bd77fd165", "sha256": "81a094adee4d27e22e7a6877d4cdd0c9948a70c69fac2822e7a7fb5f22cb2bc1" }, "downloads": -1, "filename": "kaldi-adapt-lm-0.1.3.tar.gz", "has_sig": false, "md5_digest": "68efdfb55437fae31bea4e1bd77fd165", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4090, "upload_time": "2018-08-24T08:35:57", "url": "https://files.pythonhosted.org/packages/41/10/ea00fb6ed3ef886b6f3e31dc1e8fb9adda5fce4eb1f33fc23288ca8d1aff/kaldi-adapt-lm-0.1.3.tar.gz" } ] }