
Compensation Methods of Different Speech Coding for Speaker Recognition

Author LiXueLin
Tutor HanJiQing
School Harbin Institute of Technology
Course Computer Science and Technology
Keywords Speaker identification; Text-independent; Speech coding; Maximum A Posteriori estimation; Maximum Likelihood estimation; Score compensation
CLC TN912.34
Type Master's thesis
Year 2008

Speaker recognition offers flexibility, low cost, accuracy, and extensibility, and therefore has broad application prospects in biometric recognition. Although such systems perform well in the laboratory, their performance degrades rapidly under real-world conditions. One of the main causes is the coding mismatch between training data and testing data. This is especially true for speaker recognition in a network environment, where the available training data comes from one speech coder while the testing data in actual use comes from another. In this situation, recognition performance is seriously affected. To improve speaker recognition performance in a network environment and make the system more practical, the speech coding mismatch problem must be resolved first; that is, the influence of the coding mismatch between training and testing conditions must be eliminated.
This thesis studies compensation approaches that effectively overcome the impact of different speech codings, so as to improve speaker recognition performance in a network environment. These approaches operate mainly in the feature domain and the score domain. For coding feature compensation, the MAP (Maximum A Posteriori) method and the ML (Maximum Likelihood) method are applied to the speaker recognition system. For score compensation, the likelihood ratio score normalization method previously used for channel compensation is adopted to further improve system performance. Recognition is first performed with a GMM (Gaussian Mixture Model); a secondary decision is then made using coding score normalization, which yields the final recognition result. The baseline system is a text-independent speaker identification system.
Experimental results show that applying the MAP method for coding compensation first, and then the likelihood score method for score compensation, gives the best recognition rate of 83.4% in open-set tests.
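The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea of GMM scoring followed by a likelihood-ratio normalization and an open-set decision. It assumes diagonal-covariance GMMs stored as lists of (weight, mean, variance) tuples, a single background model used as the normalizer, and a rejection threshold of 0.0. The function names, the background model, and the threshold are hypothetical choices for illustration and are not taken from the thesis.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed representation).
    """
    total = 0.0
    for x in frames:
        comp = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comp)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)


def identify_open_set(frames, speaker_gmms, background_gmm, threshold=0.0):
    """Return (speaker name or None, normalized score of the best speaker).

    The normalized score is the speaker log-likelihood minus the background
    log-likelihood (a log-likelihood ratio). None means the utterance is
    rejected as coming from an unknown speaker.
    """
    bg = gmm_avg_loglik(frames, background_gmm)
    scores = {
        name: gmm_avg_loglik(frames, gmm) - bg
        for name, gmm in speaker_gmms.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Toy one-dimensional models, purely illustrative.
speakers = {
    "a": [(1.0, [0.0], [1.0])],
    "b": [(1.0, [5.0], [1.0])],
}
background = [(1.0, [2.5], [4.0])]

name, score = identify_open_set([[0.1], [-0.2], [0.3]], speakers, background)
impostor, impostor_score = identify_open_set([[20.0]], speakers, background)
```

In this toy setup the frames near 0 are assigned to speaker "a", while the frame at 20 scores below the background model for every enrolled speaker and is rejected. In a coding-mismatch setting, the normalizer would presumably be chosen per speech coder, but how the thesis does this is not stated in the abstract.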