
Research on Sparse Image Representation and Coding Model

Author YiLiQi
Tutor ZhaoDeBin
School Harbin Institute of Technology
Course Computer Science and Technology
Keywords Image compression; human visual system; sparse coding; curvelet
CLC TP391.41
Type Master's thesis
Year 2008

Image compression is a key technique for image storage and transmission. State-of-the-art compression methods take pixel matrices as the coding object and achieve compression by eliminating redundancy through prediction, transformation, quantization, and entropy coding. After years of research and development, it has become hard to obtain further compression gain with these methods, while their complexity keeps increasing significantly. Moreover, current coding techniques divide the image into macroblocks and use the distortion of pixel values as the criterion for rate-distortion optimization, which results in poor reconstruction quality. This dissertation presents broad and in-depth research on a coding object based on the human visual system.

First, recent findings from research on the brain, neurons, psychology, and other related areas are discussed, and a coding-oriented model of human vision is set up. Special attention is given to the specific characteristics that could improve coding efficiency and image quality assessment. Following the physical model of the visual system, the efficient coding theory of neurons is described, including the bilinear model, the high-order linear model, and the high-order nonlinear model. Of these, the high-order linear model, namely the sparse coding model, is selected as the one most likely to be of practical use.

The most important part of the sparse coding model is the construction of the overcomplete dictionary of basis functions with which the image is coded. Initially, a learning-based method is used: a subset of an overcomplete basis dictionary is learned from training images that have been Gaussian-whitened. The resulting basis functions are localized and band-pass, matching the characteristics of the human visual system. Because their responses are sparsely distributed, the set can be used as a basis for coding images.
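As a point of reference, the sparse coding model is often written in the following form. This is a common textbook formulation given here as an assumption, not necessarily the exact objective used in the dissertation; the symbols x (a whitened image patch of n pixels), D (the dictionary whose m columns are the basis functions), a (the coefficient vector), and λ (the sparsity weight) are introduced only for illustration.

```latex
% Sparse coding of one image patch x with an overcomplete dictionary D (m > n).
% Assumed formulation; the l1 penalty is one common choice of sparsity measure.
\begin{aligned}
  x &\approx D a = \sum_{i=1}^{m} a_i \,\phi_i,
      \qquad D = [\phi_1, \dots, \phi_m] \in \mathbb{R}^{n \times m},\; m > n, \\
  \hat{a} &= \arg\min_{a} \; \tfrac{1}{2}\,\lVert x - D a \rVert_2^2
             + \lambda \,\lVert a \rVert_1 .
\end{aligned}
```

In the learning-based setting, D is estimated as well by alternating between this minimization over a and an update of the basis functions over many training patches.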
Training-based methods, however, suffer from unstable convergence and a lack of generalization ability. To overcome these drawbacks of training-based dictionary construction, the multi-resolution, multi-scale curvelet transform is introduced. The curvelet transform approximates curve-like singularities in two dimensions very well. At the same time, the coefficients of the basis functions in the differently oriented sub-bands are sparse, which means that the main structure of an image can be reconstructed from only a small number of large coefficients. Experimental results show that images reconstructed from the largest 10% of the coefficients meet basic visual requirements, and that with the largest 50% of the coefficients the visual distortion is hardly noticeable. This method could serve as the image representation component of a vision-based image coding and decoding framework, which is of considerable academic and practical importance.
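The idea of reconstructing an image from only its largest transform coefficients can be illustrated with a minimal sketch. Note the assumptions: the code below uses an orthonormal Haar wavelet transform as a simple stand-in, not the curvelet transform studied in the dissertation, and it runs on a small synthetic image rather than the thesis's test data. All function names (`haar_2d`, `keep_largest`, `demo_image`, and so on) are hypothetical and chosen only for this example.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_1d(v):
    """Full orthonormal Haar decomposition of a list whose length is a power of 2."""
    out = list(v)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        dif = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = avg + dif  # coarse part first, detail part second
        n = half
    return out

def ihaar_1d(c):
    """Inverse of haar_1d."""
    out = list(c)
    n = 1
    while n < len(out):
        rec = []
        for a, d in zip(out[:n], out[n:2 * n]):
            rec += [(a + d) / SQRT2, (a - d) / SQRT2]
        out[:2 * n] = rec
        n *= 2
    return out

def transpose(m):
    return [list(r) for r in zip(*m)]

def haar_2d(img):
    """Separable 2D transform: rows first, then columns."""
    rows = [haar_1d(r) for r in img]
    return transpose([haar_1d(c) for c in transpose(rows)])

def ihaar_2d(coef):
    """Inverse of haar_2d: columns first, then rows."""
    cols = transpose([ihaar_1d(c) for c in transpose(coef)])
    return [ihaar_1d(r) for r in cols]

def keep_largest(coef, fraction):
    """Zero all but the largest-magnitude `fraction` of the coefficients."""
    flat = [(abs(v), i, j) for i, row in enumerate(coef) for j, v in enumerate(row)]
    flat.sort(reverse=True)
    k = max(1, int(round(fraction * len(flat))))
    keep = {(i, j) for _, i, j in flat[:k]}
    return [[v if (i, j) in keep else 0.0 for j, v in enumerate(row)]
            for i, row in enumerate(coef)]

def mse(a, b):
    """Mean squared error between two equally sized images."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def demo_image(n=16):
    """Synthetic test image (assumption): a diagonal edge plus a mild gradient."""
    return [[(200.0 if j > i else 50.0) + 2.0 * j for j in range(n)] for i in range(n)]

def reconstruct(img, fraction):
    """Transform, keep the largest coefficients, and invert."""
    return ihaar_2d(keep_largest(haar_2d(img), fraction))

if __name__ == "__main__":
    img = demo_image()
    for frac in (0.1, 0.5, 1.0):
        print(f"kept {frac:.0%} of coefficients, MSE = {mse(img, reconstruct(img, frac)):.4f}")
```

Because the transform is orthonormal, the reconstruction error equals the energy of the discarded coefficients, so keeping a larger fraction can only reduce it. A curvelet dictionary would be expected to concentrate the energy of curved edges into fewer coefficients than Haar wavelets do, which is the property the abstract relies on.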
