Research on a Few Key Issues in Nonlinear Dimensionality Reduction Algorithms
|Course||Applied Computer Technology|
|Keywords||Nonlinear dimensionality reduction; Evaluation model; Incremental algorithms; Classification; Clustering|
Dimensionality reduction is one of the most important research tasks in machine learning and a significant step in handling multi-dimensional data. Nonlinear dimensionality reduction (NDR) techniques in particular have become an active research field. This paper studies a few key issues in NDR algorithms.

Firstly, we analyze and compare three evaluation models, based respectively on the stress function, the residual variance, and the DY-DX representation, and we propose an evaluation model based on the variance of distance ratios. Experiments show that this model can evaluate results from the same algorithm under different parameters and can also compare results from different methods. We further discuss how to use the stress function, the residual variance, and the variance of distance ratios to select the neighborhood parameter and the dimension of the low-dimensional space.

Secondly, on the issue of incremental algorithms, we obtain an incremental algorithm based on distance preserving (IADP) by improving the incremental ISOMAP, and we present an incremental algorithm based on topology preserving (IATP) and an incremental algorithm based on k-nearest-neighbor projecting (IAkNNP). All three can map objects outside the training set into the embedded space. Theoretical analysis and experiments show that IADP gives better results while IATP is more efficient; both, however, are only extensions of ISOMAP. IAkNNP, by contrast, can extend any NDR method and also offers better results and higher efficiency. Because the embedded coordinates of new data do not depend on the embeddings of the training set, IATP is insensitive to noise. For the other two methods, the effect of noise can be ignored when noise in the training set is properly treated.

Lastly, we discuss the application of NDR techniques to classification and clustering analysis. Experiments in fingerprint and text classification demonstrate that combining NDR algorithms with classification techniques improves efficiency and reduces memory requirements without loss of accuracy. Experiments in clustering analysis indicate that the clustering algorithm based on NDR methods can find clusters of any shape and that its results are superior to those of the K-MEANS algorithm.
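
To make the evaluation measures named in this abstract more concrete, the following is a minimal sketch in Python. The abstract gives no formulas, so everything here is an assumption for illustration: the stress uses one common normalized form, the residual variance is taken as 1 - R^2 between the two sets of pairwise distances (with plain Euclidean distances standing in for the geodesic distances that ISOMAP would use), and the variance of distance ratios is taken as the population variance of low-dimensional distance divided by high-dimensional distance. All function names are hypothetical and may differ from the definitions used in the thesis.

```python
import math
from statistics import pvariance


def pairwise_distances(points):
    """Euclidean distances for all pairs i < j, in a fixed order."""
    return [math.dist(points[i], points[j])
            for i in range(len(points))
            for j in range(i + 1, len(points))]


def stress(d_high, d_low):
    """Normalized stress (assumed form): sqrt(sum (dx - dy)^2 / sum dx^2)."""
    num = sum((dx - dy) ** 2 for dx, dy in zip(d_high, d_low))
    den = sum(dx ** 2 for dx in d_high)
    return math.sqrt(num / den)


def residual_variance(d_high, d_low):
    """1 - R^2, where R is the Pearson correlation of the two distance lists."""
    n = len(d_high)
    mean_x = sum(d_high) / n
    mean_y = sum(d_low) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(d_high, d_low))
    var_x = sum((x - mean_x) ** 2 for x in d_high)
    var_y = sum((y - mean_y) ** 2 for y in d_low)
    return 1.0 - (cov * cov) / (var_x * var_y)


def distance_ratio_variance(d_high, d_low):
    """Population variance of dy / dx over all pairs with dx > 0 (assumed form)."""
    ratios = [dy / dx for dx, dy in zip(d_high, d_low) if dx > 0]
    return pvariance(ratios)


# Toy data: four points on a straight line in 3-D.
X = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 3, 0)]
Y_good = [(0,), (1,), (2,), (3,)]   # distances preserved up to a constant scale
Y_bad = [(0,), (1,), (4,), (9,)]    # same ordering, but distances distorted

d_X = pairwise_distances(X)
d_good = pairwise_distances(Y_good)
d_bad = pairwise_distances(Y_bad)

stress_good = stress(d_X, d_good)                   # about 0.293: penalizes the scale change
resvar_good = residual_variance(d_X, d_good)        # about 0: distances perfectly correlated
ratio_var_good = distance_ratio_variance(d_X, d_good)  # about 0: every ratio is equal
ratio_var_bad = distance_ratio_variance(d_X, d_bad)    # clearly positive: ratios differ
```

In this toy case the uniformly rescaled embedding still receives a nonzero stress, while the ratio-variance measure is essentially zero because every pairwise distance shrinks by the same factor. Under the assumed definitions, that scale invariance is one plausible reason such a measure could be used to compare outputs of different methods, though the thesis itself should be consulted for the actual definition and argument.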