Research on Mid-Tier Semantic Cache for Massive Databases
|School||National University of Defense Science and Technology|
|Course||Computer Science and Technology|
|Keywords||Massive Database; Semantic Cache; Query Processing; Cache Replacement; Consistency Maintenance|
With the ever increasing use of massive databases in critical business, improving query performance has become a key problem that urgently needs to be solved. Massive database applications issue large numbers of aggregate queries, whose execution tends to be time-consuming and costly, so improving the efficiency of aggregate queries will substantially improve the performance of the whole system. Semantic caching is an effective way to improve performance by reusing the results of previously answered queries. With the popularity of the three-tier architecture, mid-tier semantic caching has become a focus of research on performance optimization. However, current semantic cache technology cannot yet meet the demand for high performance when processing aggregate queries in massive database applications, owing to its lack of support for them.

Based on a detailed analysis of current research on semantic caching, this thesis aims to improve the performance of aggregate queries in massive database applications by proposing a novel mid-tier semantic cache mechanism. The thesis is mainly devoted to a mid-tier semantic cache mechanism for aggregate queries, aggregate query processing, semantic cache management, and semantic cache consistency maintenance, and makes the following contributions:

1. The thesis proposes a mid-tier semantic cache mechanism that supports aggregate queries. The mechanism defines the semantic cache as a collection of cache items and deploys the cache at the middle tier. It thereby combines the merits of both query shipping and data shipping, and effectively improves cache utilization by exploiting the semantic relationships among user queries.

2. The thesis develops a mechanism that processes aggregate queries based on the semantic cache. After discussing how an aggregate query can match a cached item, the mechanism defines the possible match types and puts forward a method for determining them. It then studies aggregate query processing under each match type and finally presents an algorithm that answers aggregate queries from the cache. Performance tests demonstrate that this mechanism greatly improves the performance of aggregate queries.

3. The thesis proposes an effective semantic cache management mechanism. The mechanism first introduces the concept of a virtual cache and manages the cache by dividing it into virtual, temporal, and persistent caches, which improves the flexibility of cache management. To reduce redundancy and maintenance cost, the mechanism merges related cache items under appropriate conditions. Finally, two cache replacement policies, LFURC and RBHR, are put forward. Based on recency, access frequency, and cache structure, LFURC evicts the cache item least used within the most recent time period. Based on the observation that users tend to access hot regions, RBHR assigns each item a replacement value that comprehensively weighs the factors in its cache description. Performance tests demonstrate that the virtual cache and cache merging improve query performance effectively, and that LFURC and RBHR outperform LRU and LFU.

4. The thesis presents a cache consistency maintenance policy for massive database applications. We first analyze several typical consistency maintenance policies and then discuss cache consistency from the perspective of incremental maintenance. Finally, a periodic incremental maintenance policy is proposed to preserve cache consistency, based on the characteristics of massive database applications. Experimental results demonstrate the validity of this policy.

5. Based on the above research and the parallel database middleware StarTPMonitor, the thesis designs and implements a mid-tier semantic cache named StarCache, which can be configured flexibly.
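To illustrate the kind of match determination described above, the following sketch classifies how a new range query relates to a cached item and splits a partial hit into a probe query (answered from the cache) and remainder queries (shipped to the database). This is a simplified one-dimensional illustration under assumed semantics, not the thesis's actual algorithm; all function names are hypothetical.

```python
# Illustrative sketch (assumed semantics, not the thesis's algorithm):
# match determination between a new range query and a cached item,
# for one-dimensional predicates of the form lo <= x < hi.

def classify_match(cached, query):
    """Classify how `query` (lo, hi) relates to `cached` (lo, hi)."""
    c_lo, c_hi = cached
    q_lo, q_hi = query
    if (q_lo, q_hi) == (c_lo, c_hi):
        return "exact"          # cache answers the query completely
    if c_lo <= q_lo and q_hi <= c_hi:
        return "containing"     # query is a subset of the cached region
    if q_hi <= c_lo or c_hi <= q_lo:
        return "disjoint"       # cache contributes nothing
    return "overlapping"        # partial hit: split into probe + remainder

def split_query(cached, query):
    """Split `query` into the part answerable from the cache (probe)
    and the parts that must be shipped to the database (remainder)."""
    c_lo, c_hi = cached
    q_lo, q_hi = query
    probe = (max(q_lo, c_lo), min(q_hi, c_hi))
    if probe[0] >= probe[1]:
        return None, [query]            # disjoint: everything is remainder
    remainder = []
    if q_lo < c_lo:
        remainder.append((q_lo, c_lo))  # uncached part below the cache
    if c_hi < q_hi:
        remainder.append((c_hi, q_hi))  # uncached part above the cache
    return probe, remainder
```

For example, with a cached range (0, 10), the query (5, 15) is an overlapping match: the probe (5, 10) is answered locally, and only the remainder (10, 15) is sent to the server.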
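The replacement behavior attributed to LFURC above (evict the item least used within the most recent time period) can be sketched as follows. This is a minimal illustration of the assumed semantics, not the thesis's actual formula; the class name, window parameter, and tie-breaking are all assumptions.

```python
# Illustrative sketch of an LFURC-style policy (assumed semantics):
# on eviction, drop the cache item with the fewest accesses inside
# the most recent time window.

from collections import deque

class LFURCCache:
    def __init__(self, capacity, window=60.0):
        self.capacity = capacity
        self.window = window      # length of the "recent" period, in seconds
        self.items = {}           # key -> cached result
        self.accesses = {}        # key -> deque of access timestamps

    def _recent_count(self, key, now):
        hits = self.accesses[key]
        while hits and hits[0] < now - self.window:
            hits.popleft()        # discard accesses that fell out of the window
        return len(hits)

    def get(self, key, now):
        if key in self.items:
            self.accesses[key].append(now)
            return self.items[key]
        return None               # cache miss

    def put(self, key, value, now):
        if key not in self.items and len(self.items) >= self.capacity:
            # Evict the item with the fewest accesses in the recent window.
            victim = min(self.items, key=lambda k: self._recent_count(k, now))
            del self.items[victim]
            del self.accesses[victim]
        self.items[key] = value
        self.accesses.setdefault(key, deque()).append(now)
```

Unlike plain LFU, counts here decay naturally because accesses outside the window no longer contribute, which matches the intent of combining frequency with recency.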
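The idea behind periodic incremental maintenance (contribution 4) is that, instead of recomputing a cached aggregate from the base table, the cache periodically pulls only the rows changed since its last refresh and folds them into the stored aggregate. The sketch below shows this for an insert-only SUM/COUNT aggregate; the `fetch_delta` helper and the cache-item layout are hypothetical, not StarCache's actual interface.

```python
# Illustrative sketch (assumed design, not StarCache's actual code):
# periodic incremental maintenance of a cached SUM/COUNT aggregate.

def refresh(cached, fetch_delta):
    """Fold the changes since `cached['as_of']` into the cached aggregate.

    `fetch_delta(since)` is a hypothetical helper returning
    (new_values, new_version) -- e.g. rows read from a change log
    on the server since version `since`.
    """
    values, version = fetch_delta(cached["as_of"])
    for value in values:
        cached["sum"] += value    # fold each newly inserted row into SUM
        cached["count"] += 1      # ...and into COUNT
    cached["as_of"] = version     # advance the item's consistency point
    return cached
```

Run at a fixed period, such a refresh bounds staleness by the period length while touching only the delta, which is what makes it attractive when the base tables are massive.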
StarCache has since been applied in a national project, the Large Scale Transaction Processing System. Both the application and the experimental results demonstrate that StarCache can effectively optimize aggregate queries in massive database applications and meet their high-performance requirements.