Query Processing and Optimization in Massive Multi-Database Integration

Author LiuYuYang
Tutor LiJianZhong
School Harbin Institute of Technology
Course Computer Science and Technology
Keywords Schema mapping, Query processing, Query decomposition, Query optimization
CLC TP311.13
Type Master's thesis
Year 2008

With the development of the Internet, data sources of all kinds are increasing rapidly, and they differ in type and structure. Yet the data, which are the core of all applications, are still stored in different systems in different ways and remain isolated in distributed databases. As application requirements steadily grow, more and more people want to access and manipulate useful data across multiple massive data sources, and to achieve interoperability among multiple computer systems and data sources. However, these data sources may not only be located geographically in multiple autonomous domains, in heterogeneous databases with different data formats, storage modes and access control policies, but may also differ logically in data models, manipulation languages and data semantics. Moreover, the sharing ability, modes and contents of the sources may change at any time. As a result, designing a multi-database integration system that supports a common data model and a uniform query language is a better way to implement this type of interoperation. Such a system can hide most of the differences in access methods and user interfaces among multiple data management systems. It also provides an information interoperation platform as a common interface for accessing multiple data sources and combining the intermediate query results from these sources.

Query processing is one of the key techniques in a multi-database integration system. Query decomposition, result combination and query optimization are its central problems. First, the dissertation defines the basic concepts of query processing and gives the architecture. After analyzing the characteristics and requirements of the system, we choose M-SQL as the query language. Based on this discussion, the basic principles and algorithm of global query decomposition are given, and the semantic equivalence of the algorithm is also discussed. Second, several result combination algorithms are proposed. Result combination is the process of scheduling the query execution plan and combining the intermediate results according to the post-processing operations. A basic join algorithm is proposed, as is a nonblocking result combination algorithm, which includes a loading algorithm for online join, so that the client can get results as soon as possible. Last, several subquery rewriting methods are proposed to optimize query processing.

The above theoretical principles and practical techniques are adopted in developing a Web Services based multi-database integration system, which provides query decomposition, result combination and query optimization. It can provide transparent access to multiple data sources, such as Oracle, Sybase and DB2. The results of the performance analysis and evaluation of the system are presented at the end.
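
The abstract does not give the details of the nonblocking result combination algorithm. As a rough illustration of the general idea only, the sketch below uses a symmetric hash join, a standard nonblocking join technique that is swapped in here and is not claimed to be the dissertation's method. The function name, the row layout and the sample data are all hypothetical. Rows from two intermediate result streams are consumed alternately, and each matching pair is emitted as soon as both of its rows have arrived, so the client does not wait for either source to finish.

```python
from collections import defaultdict
from itertools import zip_longest


def symmetric_hash_join(left, right, left_key, right_key):
    """Yield (left_row, right_row) pairs with equal keys as early as possible.

    Assumption: `left` and `right` are iterables of rows returned by two
    component databases; `left_key` and `right_key` extract the join key.
    This is a generic symmetric hash join, not the thesis's algorithm.
    """
    left_table = defaultdict(list)   # join key -> left rows seen so far
    right_table = defaultdict(list)  # join key -> right rows seen so far
    done = object()                  # sentinel for an exhausted stream

    # Alternate between the two streams, one row from each per round.
    for l_row, r_row in zip_longest(left, right, fillvalue=done):
        if l_row is not done:
            key = left_key(l_row)
            left_table[key].append(l_row)
            # Probe the rows already received from the other side.
            for match in right_table.get(key, ()):
                yield (l_row, match)
        if r_row is not done:
            key = right_key(r_row)
            right_table[key].append(r_row)
            for match in left_table.get(key, ()):
                yield (match, r_row)


# Hypothetical intermediate results: (customer_id, value) rows.
orders = [(1, "book"), (2, "pen"), (3, "ink")]
customers = [(2, "Wang"), (1, "Zhao")]

joined = list(symmetric_hash_join(orders, customers,
                                  left_key=lambda row: row[0],
                                  right_key=lambda row: row[0]))
```

Because both inputs are hashed as they arrive, this sketch keeps every row seen so far in memory; a real system would need a policy for spilling or discarding rows, which the abstract does not describe.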
