Connected Digit Speech Recognition Based on the Acoustic Universal Structure
|Course||Signal and Information Processing|
|Keywords||Chinese connected digit speech recognition; Acoustic universal structure; Histogram equalization; Feature classification|
Chinese connected digit speech recognition has broad real-world application prospects: automatic voice dialing in telephone and telecommunication systems, confirmation of identity card numbers, and remote control of smart appliances, such as selecting TV channels or setting the air-conditioner temperature. It is an important branch of speech recognition, and its main difficulties are as follows. First, the length of the digit string is not known in advance, so it is hard to locate the boundaries between words in a continuous string accurately. Second, the digits in a string can occur in any combination, so no grammatical knowledge is available to constrain recognition. Third, the pronunciation of Chinese digits itself makes recognition hard: the digits are highly confusable with one another, and coarticulation between adjacent digits in a continuous string is severe. In addition, speech communication is inevitably affected by speaker differences, line interference, and environmental noise; the resulting signal distortion leaves the recognition system far from robust. This research covers two aspects. (1) Connected digit speech recognition described by the acoustic universal structure (AUS).
The AUS describes the intrinsic relationships within speech and is robust to variation caused by transmission-line noise and speaker differences. Building on it, a dual-structure speech model matching strategy suited to connected digit speech recognition is proposed. For the case where no grammatical knowledge is available, the strategy does not require a large number of training templates: digit strings of arbitrary length can be recognized from only a small amount of isolated-digit training data. It thereby addresses the robustness of connected digit speech recognition to speaker differences without an adequate training corpus and without the usual channel normalization techniques. (2) Robust speech recognition using histogram equalization. In speech recognition, additive noise is another important cause of degraded system performance. Histogram equalization is a nonlinear compensation transform, and it improves system robustness further than traditional noise-robust methods based on linear transformations. In a practical recognition system, however, noise-induced nonlinear distortion of the speech features is not the only problem: the class distributions of the speech features in the training and test data are also inconsistent, which makes it hard for traditional histogram equalization to deliver its full advantage. This paper proposes a histogram equalization method based on feature classification. Experimental results show that at low signal-to-noise ratios, in both stationary and non-stationary noise, the method improves the robustness of the recognition system further than traditional histogram equalization.
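To make the idea of histogram equalization as a nonlinear compensation transform concrete, the following is a minimal sketch of the traditional, class-independent form applied to a single feature dimension. It is an illustration under stated assumptions, not the method evaluated in this work: the function names, the rank-based CDF estimate, and the linear interpolation of training quantiles are all choices made here for the example. Each test value is replaced by the training value that sits at the same cumulative probability, so any monotonic distortion of the test features is undone.

```python
from bisect import bisect_right


def quantile(sorted_ref, p):
    """Inverse empirical CDF of sorted_ref at probability p in [0, 1],
    using linear interpolation between neighbouring order statistics."""
    if len(sorted_ref) == 1:
        return float(sorted_ref[0])
    pos = p * (len(sorted_ref) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_ref) - 1)
    frac = pos - lo
    return sorted_ref[lo] * (1.0 - frac) + sorted_ref[hi] * frac


def histogram_equalize(test_values, train_values):
    """Map one feature dimension of the test data onto the training
    distribution (hypothetical helper, traditional class-independent form)."""
    sorted_test = sorted(test_values)
    sorted_train = sorted(train_values)
    n = len(sorted_test)
    equalized = []
    for x in test_values:
        # Rank-based estimate of the test CDF, kept strictly inside (0, 1).
        rank = bisect_right(sorted_test, x)
        p = (rank - 0.5) / n
        # Look up the training value at the same cumulative probability.
        equalized.append(quantile(sorted_train, p))
    return equalized


if __name__ == "__main__":
    train = [0.0, 1.0, 2.0, 3.0, 4.0]
    clean = [1.0, 3.0, 2.0, 5.0, 4.0]
    # A monotonic nonlinear distortion standing in for the effect of noise.
    distorted = [c ** 3 + 7.0 for c in clean]
    print(histogram_equalize(clean, train))
    print(histogram_equalize(distorted, train))
```

Because the mapping depends only on the rank of each test value, the clean and distorted inputs in the usage example are mapped to the same output. A class-based variant in the spirit of the proposed method would presumably estimate and match the distributions separately for each feature class; that step is an assumption and is not shown here.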