Semi-supervised Band Selection for Hyperspectral Remote Sensing Data
|Course||Communication and Information System|
|Keywords||Hyperspectral data; Classification; Band selection; Semi-supervised learning; Graph Laplacian; Sample selection|
Hyperspectral remote sensing, which acquires the continuous spectrum of ground objects from the visible to the near-infrared bands, is an important means of obtaining remote sensing information. Because hyperspectral data are high-dimensional and massive, they pose great challenges for subsequent data processing and analysis. Dimensionality reduction through band selection has therefore become a significant pre-processing step in hyperspectral data processing. Currently, most band selection methods rely only on label information, and the information from a small set of labeled samples often misleads these supervised methods. Semi-supervised band selection, which combines information from labeled and unlabeled samples, is an efficient means of dimensionality reduction when labeled samples are limited. Against this background, this dissertation studies semi-supervised band selection for hyperspectral remote sensing data. The main contents are as follows:

(1) Band selection theory and semi-supervised learning are studied systematically, and several semi-supervised band selection methods are summarized together with their advantages and disadvantages.

(2) GST_FS, a semi-supervised band selection method based on the graph Laplacian and the self-training idea, is proposed. The method first puts forward a semi-supervised criterion for feature ranking to generate the initial band subset. Supervised classification is then carried out on this band subset, and the unlabeled samples with higher confidence values are added to the labeled sample set. Afterwards, the band subset is updated according to the feature ranking computed on the newly generated labeled and unlabeled data, and is used for classification again. The process is repeated to obtain the final subset. Experiments on hyperspectral data sets compare GST_FS with several unsupervised, supervised, and semi-supervised band selection methods. The results show that GST_FS produces a band subset with better performance than those of the compared methods.

(3) Sample reduction is introduced for the graph-based method to cut down the amount of data used in constructing the graph. Principal component analysis is first used for dimensionality reduction. The watershed algorithm is then applied to over-segment the reduced images. Finally, a percentage of the unlabeled samples is randomly selected from the resulting clusters. The chosen samples are combined with the labeled samples to construct the graph in GST_FS and complete the semi-supervised band selection. Experiments on hyperspectral data sets show that the proposed approach performs better than simple sampling. By selecting representative unlabeled samples for graph-based algorithms, the method not only greatly reduces the computational burden but also improves algorithm performance.
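
The rank, classify, augment, re-rank loop described in (2) can be illustrated with a minimal Python sketch. The abstract does not state the ranking criterion or the classifier, so everything concrete here is an assumption made for illustration: the score (a Fisher ratio on labeled samples divided by a Laplacian score on all samples), the nearest-centroid classifier with a distance margin as confidence, and all function and parameter names (`knn_graph`, `band_scores`, `classify`, `gst_fs`, `n_add`, `k`). It is not the dissertation's implementation.

```python
import numpy as np

def knn_graph(X, k=5):
    """Symmetric 0/1 k-nearest-neighbour adjacency matrix (small n only)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                      # exclude self-matches
    nn = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((len(X), len(X)))
    W[np.repeat(np.arange(len(X)), k), nn.ravel()] = 1.0
    return np.maximum(W, W.T)

def band_scores(X, y, labeled, k=5):
    """ASSUMED criterion: Fisher ratio (labeled) / Laplacian score (all)."""
    W = knn_graph(X, k)
    d = W.sum(1)
    L = np.diag(d) - W                                # unnormalised graph Laplacian
    Xc = X - (d @ X) / d.sum()                        # remove degree-weighted band means
    lap = (Xc * (L @ Xc)).sum(0) / ((d[:, None] * Xc ** 2).sum(0) + 1e-12)
    Xl, yl = X[labeled], y[labeled]
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(yl):
        Xk = Xl[yl == c]
        between += len(Xk) * (Xk.mean(0) - Xl.mean(0)) ** 2
        within += ((Xk - Xk.mean(0)) ** 2).sum(0)
    return (between / (within + 1e-12)) / (lap + 1e-12)   # higher is better

def classify(X, y, labeled, bands):
    """Stand-in classifier: nearest class centroid; needs at least 2 classes."""
    classes = np.unique(y[labeled])
    cents = np.stack([X[labeled & (y == c)][:, bands].mean(0) for c in classes])
    dist = np.linalg.norm(X[:, bands][:, None, :] - cents[None], axis=2)
    ordered = np.sort(dist, axis=1)
    confidence = ordered[:, 1] - ordered[:, 0]        # margin between two closest centroids
    return classes[dist.argmin(1)], confidence

def gst_fs(X, y, n_bands, n_iter=3, n_add=10, k=5):
    """y holds class ids for labeled samples and -1 for unlabeled ones."""
    y = y.copy()
    labeled = y >= 0
    rank = lambda: np.argsort(band_scores(X, y, labeled, k))[::-1][:n_bands]
    bands = rank()                                    # initial band subset
    for _ in range(n_iter):
        cand = np.flatnonzero(~labeled)
        if cand.size == 0:
            break
        pred, conf = classify(X, y, labeled, bands)
        pick = cand[np.argsort(conf[cand])[::-1][:n_add]]  # most confident unlabeled
        y[pick] = pred[pick]                          # pseudo-label them
        labeled[pick] = True
        bands = rank()                                # update subset on enlarged set
    return np.sort(bands)

# Synthetic usage: 6 bands, only bands 0 and 1 separate the two classes.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 40)
X = rng.normal(size=(80, 6))
X[:, :2] += 6.0 * truth[:, None]
y = np.full(80, -1)
seed_idx = [0, 1, 2, 3, 4, 40, 41, 42, 43, 44]
y[seed_idx] = truth[seed_idx]
selected = gst_fs(X, y, n_bands=2)
print(selected)
```

The dense pairwise-distance graph keeps the sketch short but scales quadratically with the number of samples, which is the cost that the sample reduction in (3) is meant to address.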
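
The sample reduction in (3) can likewise be sketched under stated assumptions. PCA is done here with a plain SVD, and the per-segment random draw follows the description above. The watershed step itself is not implemented: the segment map is taken as an input, and the demo substitutes a regular grid of blocks for a real over-segmentation of the PCA image. The names `pca_reduce`, `sample_from_segments`, `fraction`, and `seed` are hypothetical.

```python
import numpy as np

def pca_reduce(cube, n_components=3):
    """Project an (H, W, B) cube onto its leading principal components."""
    H, W, B = cube.shape
    X = cube.reshape(-1, B).astype(float)
    X -= X.mean(0)                                    # centre each band
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return (X @ Vt[:n_components].T).reshape(H, W, n_components)

def sample_from_segments(segments, labeled_mask, fraction=0.1, seed=0):
    """Randomly pick `fraction` of the unlabeled pixels in every segment.

    Returns flat pixel indices; each segment with unlabeled pixels
    contributes at least one sample.
    """
    rng = np.random.default_rng(seed)
    seg = segments.ravel()
    unlabeled = ~labeled_mask.ravel()
    chosen = []
    for s in np.unique(seg):
        idx = np.flatnonzero((seg == s) & unlabeled)
        if idx.size == 0:
            continue
        n = max(1, int(round(fraction * idx.size)))
        chosen.append(rng.choice(idx, size=n, replace=False))
    if not chosen:
        return np.empty(0, dtype=int)
    return np.sort(np.concatenate(chosen))

# Demo on random data. The 4x4 block grid is only a stand-in for a
# watershed over-segmentation computed from `pcs`.
H, W, B = 12, 12, 20
cube = np.random.default_rng(0).normal(size=(H, W, B))
pcs = pca_reduce(cube, n_components=3)
rows, cols = np.indices((H, W))
segments = (rows // 4) * (W // 4) + (cols // 4)
labeled_mask = np.zeros((H, W), dtype=bool)
labeled_mask[0, :5] = True
picked = sample_from_segments(segments, labeled_mask, fraction=0.25)
print(picked.size, "unlabeled pixels kept out of", int((~labeled_mask).sum()))
```

The returned indices, together with the labeled pixels, would then be the rows passed to the graph construction, so the graph is built on a fraction of the image while every segment remains represented.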