Research on Multi-dimensional Data Visualization Clustering Methods Based on HyperMap and Its Applications

Author LiLiZuo
Tutor GuoChongHui
School Dalian University of Technology
Course Management Science and Engineering
Keywords FastMap, HyperMap, visual clustering, educational data analysis, creative industry
CLC TP311.13
Type Master's thesis
Year 2013

Multi-dimensional data visualization methods display multi-dimensional data in a low-dimensional visual space, making it easier for users to find hidden features in the data. Visualization is therefore significant for multi-dimensional data analysis and knowledge discovery. To ensure that the visualization results are reliable, the structure information of the data must be preserved during dimensionality reduction. This thesis puts forward a dimensionality reduction method based on HyperMap from the point of view of optimization: on the one hand, the best coordinate axes of the object space are selected; on the other hand, the weights are optimized. The method is designed to preserve as much of the structure information of the original data as possible during visualization. The visualization method is then combined with clustering methods to develop a visual clustering method based on HyperMap. Finally, the visual clustering method is applied to education data analysis and creative industry data analysis, and the results show that the method is of practical value. The main research work is as follows:

(1) First, the FastMap and HyperMap methods are introduced and their advantages and disadvantages are analyzed. HyperMap is an improvement on FastMap. In essence, both methods project multi-dimensional data into a low-dimensional space and then visualize the low-dimensional data, so that the structure information of the original data can be observed through the visualization results. The dimensionality reduction process can be divided into two steps: the first selects pivot points to determine the coordinate axes of the target space, and the second calculates the projection coordinates of all sample points. FastMap and HyperMap have two major problems. One is that neither method can choose the best pivot points, and therefore neither can choose the best axes in the target space. The other is that, although they use a metric for the degree of information loss, they do not specify how to reach the minimum information loss.

(2) This thesis proposes an improved HyperMap visualization method and combines it with clustering methods to develop a visual clustering method. To solve the problems of FastMap and HyperMap, two main improvements are made. First, the pair of points with the maximum distance in the dataset is selected as pivot points to determine the best coordinate axes. Second, an optimization model is established using a stress function, and the best weight combination is obtained by optimizing the model parameters. Both improvements help minimize the information loss in the visualization process. In addition, the visualization results can be rotated in any direction, which eliminates the influence of the observation angle on the visualization results. Numerical experiments show that the improved method outperforms the original HyperMap method and can be effectively applied to multi-dimensional data visualization analysis. The improved visualization method is then combined with clustering methods to develop a visual clustering method, and numerical experiments show that this visual clustering method is effective and practical.

(3) The visual clustering method is applied to education data analysis and creative industry data analysis. To address the classification guidance problem in education data analysis, a systematic method comprising data processing, visual clustering and a classification guidance scheme is established, providing a reference for classification guidance. In addition, the creative industry data of 60 major cities in China is visually analyzed: the overall data is sorted and divided into groups, and the operational data for visual analysis is generated. The experimental results show that the visual clustering method can display data information intuitively and can be combined with clustering methods to improve the accuracy and rationality of clustering. It has practical value for real data analysis.
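
The two-step reduction described in (1) and the farthest-pair pivot choice described in (2) can be illustrated with a minimal FastMap-style sketch. It is not the thesis's implementation: the function names (farthest_pair, fastmap, stress) are hypothetical, the stress measure is a Kruskal-style normalized stress chosen here as an assumption, and the weight optimization of the improved HyperMap method is not reproduced because the abstract does not specify it. The exhaustive farthest-pair search costs O(n^2), whereas the original FastMap uses a cheaper heuristic to find distant pivots.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def farthest_pair(d2):
    """Indices (a, b) of the pair with the largest squared distance in d2."""
    n = len(d2)
    return max(combinations(range(n), 2), key=lambda ij: d2[ij[0]][ij[1]])


def fastmap(points, k):
    """Project points onto k axes, each defined by a pair of pivot points.

    Assumption: pivots are the exact farthest pair of the current
    residual distances (the choice described in the abstract).
    """
    n = len(points)
    # Squared residual distances, initialised with the original distances.
    d2 = [[euclidean(points[i], points[j]) ** 2 for j in range(n)]
          for i in range(n)]
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        # Step 1: select pivot points to fix the coordinate axis.
        a, b = farthest_pair(d2)
        dab2 = d2[a][b]
        if dab2 <= 1e-12:
            break  # nothing left to project; remaining coordinates stay 0
        dab = math.sqrt(dab2)
        # Step 2: projection coordinate of every point (law of cosines).
        x = [(d2[a][i] + dab2 - d2[b][i]) / (2.0 * dab) for i in range(n)]
        for i in range(n):
            coords[i][axis] = x[i]
        # Remove the part of each distance explained by this axis.
        d2 = [[max(d2[i][j] - (x[i] - x[j]) ** 2, 0.0) for j in range(n)]
              for i in range(n)]
    return coords


def stress(points, coords):
    """Normalized stress: 0 means all pairwise distances are preserved."""
    num = den = 0.0
    for i, j in combinations(range(len(points)), 2):
        d = euclidean(points[i], points[j])
        d_low = euclidean(coords[i], coords[j])
        num += (d - d_low) ** 2
        den += d ** 2
    return math.sqrt(num / den)
```

Calling stress(points, fastmap(points, 2)) gives a single number for comparing projections of the same data: a lower value means less distance information was lost. For Euclidean input that actually lies in a plane, two axes recover the pairwise distances up to rounding error.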
