Feature Extraction, Selection and Combination in Lipreading
|School||Harbin Institute of Technology|
|Course||Computer Science and Technology|
|Keywords||lip-reading; feature extraction; feature selection; feature fusion; Gabor wavelet transform; AdaBoost; manifold learning|
Lipreading is the technology of using a computer to recognize lip motion sequences. It draws on pattern recognition, artificial intelligence, image processing, and related fields. This paper focuses on feature extraction, selection and combination under a single visual channel. The main work includes:

1. In feature extraction, this paper analyzes the application of manifold learning, a nonlinear feature extraction approach, to lipreading. To extract the intrinsic features of lip motion effectively, two manifold learning methods, LLE and Isomap, are studied. Ordinary LLE and Isomap cannot effectively compute the embedding of a new sample, so a novel kernel-based LLE and Isomap method is proposed. Experimental results show that, although the proposed method does not achieve better recognition performance than traditional linear methods, it extracts the intrinsic features of lip motion more effectively than traditional methods.

2. In feature selection, this paper presents an optimal Gabor kernel selection method based on AdaBoost. The Gabor transform has attracted much attention because of its good recognition performance in pattern recognition. However, the dimensionality of the features extracted by the Gabor transform is extremely high, which demands a huge number of training samples and restricts real-world applications of lipreading. Taking into account the appearance symmetry of the mouth region and the orientation of the Gabor kernel functions, the proposed method first divides the entire mouth region into four sub-blocks and then adaptively selects the optimal Gabor kernel functions for each sub-block using AdaBoost. Experimental results confirm that the selected Gabor kernels have the same orientation as the appearance of the mouth region. The dimensionality of the resulting feature vector is significantly reduced.
The recognition performance is superior to that of traditional methods.

3. In feature combination, this paper presents a novel classifier-level combination method that combines global and local features. A large number of pixel-based feature extraction methods have been proposed in the literature, but traditional methods exploit only global or only local features. However, some psychological evidence shows that people use both global and local features for object recognition and, to some extent, use global features before analyzing the image in detail. Motivated by this evidence, the proposed method combines a global classifier and a local classifier. The global classifier uses the Discrete Fourier Transform (DFT) to extract global features, and the local classifier uses a block-based Gabor wavelet transform to extract local features. The final classifier combines the outputs of both the global and local classifiers. Experimental results show that the combined classifier achieves a distinctly higher recognition rate than either individual classifier.
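The AdaBoost-based kernel selection described in item 2 can be illustrated with a minimal sketch. The abstract does not specify the weak learner, the label encoding, or how kernel responses are summarized, so the following are assumptions made only for illustration: each candidate Gabor kernel contributes one scalar response per sample, labels are binary in {-1, +1}, and the weak learners are decision stumps. The function names (`best_stump`, `adaboost_select`, `select_kernels_per_block`) are hypothetical.

```python
import numpy as np


def best_stump(X, y, w):
    """Return (weighted error, feature index, threshold, polarity) of the best stump."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def adaboost_select(X, y, rounds=3):
    """Discrete AdaBoost over stumps; returns the feature (kernel) indices it picks.

    Assumption: column j of X holds the response of candidate Gabor kernel j.
    """
    n = X.shape[0]
    w = np.full(n, 1.0 / n)  # uniform initial sample weights
    selected = []
    for _ in range(rounds):
        err, j, thr, polarity = best_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(polarity * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)  # up-weight misclassified samples
        w /= w.sum()
        if j not in selected:
            selected.append(j)
    return selected


def select_kernels_per_block(block_responses, y, rounds=3):
    """Run the selection independently on each sub-block's response matrix."""
    return [adaboost_select(X, y, rounds) for X in block_responses]
```

Under these assumptions, `block_responses` would hold four matrices, one per mouth sub-block, and the returned index lists identify which candidate kernels are kept for each sub-block. Keeping only those kernels is what reduces the feature dimensionality.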
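The classifier-level combination described in item 3 can be sketched as follows. This is a minimal illustration and not the thesis implementation: the abstract does not state the classifier type, the fusion rule, or the Gabor parameters. The nearest-class-mean scoring, the weighted sum rule, the 2x2 block grid, and all parameter values below are therefore assumptions, and the function names are hypothetical.

```python
import numpy as np


def dft_global_features(img, keep=4):
    """Global features: magnitudes of the lowest-frequency 2-D DFT coefficients."""
    spectrum = np.fft.fft2(img)
    return np.abs(spectrum[:keep, :keep]).ravel()


def gabor_kernel(size, theta, wavelength, sigma):
    """Real part of a Gabor kernel with orientation theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * x_rot / wavelength)


def block_gabor_features(img, grid=(2, 2),
                         thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Local features: mean Gabor response magnitude per block and orientation."""
    feats = []
    for row in np.array_split(img, grid[0], axis=0):
        for block in np.array_split(row, grid[1], axis=1):
            for theta in thetas:
                kernel = gabor_kernel(size=5, theta=theta, wavelength=4.0, sigma=2.0)
                # Circular convolution via the FFT; adequate for a sketch.
                response = np.fft.ifft2(
                    np.fft.fft2(block) * np.fft.fft2(kernel, s=block.shape))
                feats.append(np.abs(response).mean())
    return np.array(feats)


def class_scores(x, class_means):
    """Placeholder classifier: softmax over negative distances to class means."""
    d = np.linalg.norm(class_means - x, axis=1)
    e = np.exp(-(d - d.min()))
    return e / e.sum()


def combine_scores(global_scores, local_scores, weight=0.5):
    """Assumed fusion rule: weighted sum of the two classifiers' outputs."""
    return weight * global_scores + (1.0 - weight) * local_scores
```

In use, each mouth image would be passed through both feature extractors, each classifier would produce per-class scores, and the predicted class would be the `argmax` of `combine_scores(...)`. The `weight` parameter is a hypothetical knob for balancing the global and local classifiers.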