Research on Selection Algorithm of Materialized View in Data Warehouse
Keywords: Data warehouse; Materialized views; Static selection; Dynamic adjustment
A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data used to support management decision-making. It holds vast amounts of data, and the complex queries it supports usually need to access large portions of that data, yet a decision support system must respond to queries quickly. Managing this much data efficiently is one of the central problems facing data warehouses. Materialized views are an important means of addressing it, but they require extra storage space and incur a maintenance cost, so choosing which views to materialize is an important research topic. This work considers both the static and the dynamic materialized view selection problem. Static selection algorithms assume either that the user supplies the probability distribution of queries or that queries are uniformly distributed over the integrated data. Chapter 3 first introduces the common static selection algorithms and analyzes their strengths and weaknesses; it then refines the cost model so that it accounts for update cost as well as query cost; finally, it proposes a materialized view selection algorithm that combines this cost model with an improved genetic algorithm (Genetic Algorithm on Materialized View, abbreviated GAMV), which is a static selection algorithm. In practical applications the uniform-distribution assumption often does not hold, and the query probability distribution supplied by the user is often inaccurate, so dynamic adjustment of materialized views is introduced.
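To make the static selection problem concrete, the following is a minimal sketch of a genetic search over sets of views. It scores each set by query cost plus update cost and enforces a storage limit. It is not the GAMV algorithm itself, whose details are not given here. The view names, sizes, costs, query frequencies, `UPDATE_FREQUENCY`, `SPACE_LIMIT`, and the truncation-selection scheme are all assumptions chosen for illustration.

```python
import random

# Hypothetical candidate views: name -> (storage size, maintenance cost per update).
VIEWS = {
    "v_sales_by_day": (40, 8.0),
    "v_sales_by_month": (10, 3.0),
    "v_sales_by_region": (15, 4.0),
    "v_sales_by_product": (25, 6.0),
}
# Hypothetical queries: (frequency, cost on base tables, {view: cost when answered from it}).
QUERIES = [
    (0.4, 100.0, {"v_sales_by_day": 40.0, "v_sales_by_month": 10.0}),
    (0.3, 100.0, {"v_sales_by_region": 15.0}),
    (0.2, 100.0, {"v_sales_by_product": 25.0, "v_sales_by_day": 40.0}),
    (0.1, 100.0, {}),
]
UPDATE_FREQUENCY = 0.5  # assumed relative weight of updates versus queries
SPACE_LIMIT = 50        # assumed storage budget for materialized views
NAMES = list(VIEWS)


def total_cost(chosen):
    """Query cost plus update (maintenance) cost for a set of materialized views."""
    query_cost = 0.0
    for freq, base_cost, via in QUERIES:
        # Each query uses the cheapest available source: base tables or a chosen view.
        best = min([base_cost] + [c for v, c in via.items() if v in chosen])
        query_cost += freq * best
    update_cost = UPDATE_FREQUENCY * sum(VIEWS[v][1] for v in chosen)
    return query_cost + update_cost


def space_used(chosen):
    return sum(VIEWS[v][0] for v in chosen)


def decode(bits):
    """Map a 0/1 chromosome to the set of view names it selects."""
    return {name for name, bit in zip(NAMES, bits) if bit}


def fitness(bits):
    """Lower is better; selections that exceed the space limit are infeasible."""
    chosen = decode(bits)
    if space_used(chosen) > SPACE_LIMIT:
        return float("inf")
    return total_cost(chosen)


def select_views(generations=50, pop_size=20, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(NAMES)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    population[0] = [0] * n  # the empty selection is always feasible
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # keep the better half (elitism)
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(1, n - 1)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return decode(best), fitness(best)


if __name__ == "__main__":
    chosen, cost = select_views()
    print(sorted(chosen), round(cost, 2))
```

Because the better half of each generation is kept unchanged, the returned selection never costs more than materializing nothing, and it always fits within the assumed space limit.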
Chapter 4 first reviews existing schemes for dynamically adjusting materialized views and analyzes their strengths and weaknesses. Second, to accommodate the diversity of user queries, it proposes a dynamic adjustment algorithm based on rough set clustering (Rough Set Clustering-Based Materialized View Dynamic Adjustment Algorithm, abbreviated RSCMVDA). The algorithm adjusts the materialized views dynamically on the basis of rough clustering; it not only meets the demands of diverse user queries but also takes dimension hierarchy relationships into account. Finally, the experiments show that as the set of user queries grows, combining GAMV with RSCMVDA gives better results than using the genetic algorithm alone, because a larger query set makes a significant change in the query probability distribution more likely.