GMM Based Connected Digits Speech Recognizer and the State of the Art of Language Modeling for Large Vocabulary Speech Recognizer |
|
Author | ZhuJia |
Advisor | ZhaoHeMing |
School | Suzhou University |
Course | Communication and Information System |
Keywords | Speech Recognition; Connected Digits; GMM; Language Model |
CLC | TN912.34 |
Type | Master's thesis |
Year | 2006 |
Downloads | 166 |
Citations | 3 |
As an interface for human-computer interaction, connected-digit speech recognition is convenient, easy to deploy, and light on memory and computation, which makes it suitable for a wide range of applications in entertainment, communications, industry, and the military. Progress in connected-digit speech recognition can also advance research on large-vocabulary speech recognition (LVSR), and the topic has long attracted researchers' attention.

The purpose of this thesis is to implement a GMM-based connected-digit speech recognizer. The recognizer builds on HMM and GMM techniques: it uses dynamic programming to perform the nonlinear time alignment between speech feature vectors and Markov state sequences, the Expectation-Maximization (EM) algorithm to re-estimate the GMM parameters, and the Levenshtein distance to compute the word error rate (WER) between the recognized and expected results. Tested on the SieTill German connected-digit speech database, the recognizer achieves a low WER.

This thesis also addresses how to build and evaluate language models for LVSR, describing in detail the linear and Kneser-Ney smoothing techniques and a strategy for estimating the discounting parameters under the maximum-likelihood criterion. The experiments use EPPS speech transcriptions and employ the Cambridge HTK LM tools, the SRILM toolkit, and the LM tools of the SPRINT speech recognition toolkit from the Computer Science Department of RWTH Aachen University to build language models,