Analysis and Implementation of Data Management Based on ETL
|School||East China Normal University|
|Keywords||Data cleansing, Data replication, ODI|
With the rapid development of computer networks and database technology, and with the growing variety of ways in which people access data, data resources have become richer and data volumes have grown dramatically. Universities, as an important part of society, have seen their level of informatization and networking change tremendously: many departments now rely, to varying degrees, on computer software to complete their work, and use it to improve business processes and office efficiency. However, the growing volume and variety of information and data pose many problems for database management, chiefly in two areas: data cleaning and data replication. How can data errors be corrected so as to avoid wrong decisions and reduce decision-making risk? How can departments exchange and share information flexibly while the data is still managed and used in a unified way? At present, the main approach is to clean the data and then replicate it synchronously. Cleaning yields data that is credible, secure, and consistent; the cleaned data is then loaded into a public database by a data replication tool, so that the school's departments can share data resources.
This thesis introduces the principles of ETL (Extract, Transform, Load)-based data cleaning and data replication and applies them in practice. The main work is as follows: (1) it reviews the current state of data cleaning and data replication technology, in China and abroad, and its applications; (2) it identifies the data quality and data consistency problems among the data sources of the University's departments; (3) it analyzes these data quality problems and designs cleaning and replication strategies; (4) it describes how to use the data cleaning and replication tool Oracle Data Integrator (ODI) to extract data from the various data sources, clean it according to predefined rules, and then replicate and load it into the target database (i.e., the public database), so as to achieve data resource sharing; (5) it notes that strategies for preventing and cleaning suspicious data, and the balance between efficiency and performance in data replication, require further discussion.
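The extract, clean, and load flow described in item (4) can be illustrated with a minimal sketch. This is not ODI and not the thesis's implementation: it uses Python's standard `sqlite3` module as a stand-in for both a departmental source and the public database, and the `students` table, its columns, and the rules in `clean_row` are hypothetical, chosen only to show the shape of the process.

```python
import sqlite3


def clean_row(row):
    """Apply hypothetical cleaning rules; return a cleaned tuple or None to reject."""
    student_id, name, dept = row
    if student_id is None or not str(student_id).strip():
        return None  # reject rows that have no usable key
    name = " ".join((name or "").split())  # collapse stray whitespace
    dept = (dept or "").strip().upper()    # normalize department codes
    if not name:
        return None  # reject rows with an empty name
    return (str(student_id).strip(), name, dept)


def run_etl(source, target):
    """Extract rows from source, clean them, and load them into target.

    Returns (rows extracted, rows that passed cleaning).
    """
    target.execute(
        "CREATE TABLE IF NOT EXISTS students ("
        "student_id TEXT PRIMARY KEY, name TEXT, dept TEXT)"
    )
    # Extract: read everything from the (assumed) departmental table.
    rows = source.execute(
        "SELECT student_id, name, dept FROM students"
    ).fetchall()
    # Transform: keep only the rows that survive the cleaning rules.
    cleaned = [c for c in (clean_row(r) for r in rows) if c is not None]
    # Load: INSERT OR REPLACE makes reloads idempotent; for duplicate
    # keys the last row wins.
    with target:
        target.executemany(
            "INSERT OR REPLACE INTO students VALUES (?, ?, ?)", cleaned
        )
    return len(rows), len(cleaned)


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE students (student_id TEXT, name TEXT, dept TEXT)")
    src.executemany(
        "INSERT INTO students VALUES (?, ?, ?)",
        [("1001", "  Li   Wei ", "cs"), (None, "Ghost", "math")],
    )
    tgt = sqlite3.connect(":memory:")
    print(run_etl(src, tgt))
    print(tgt.execute("SELECT * FROM students").fetchall())
```

Keeping the cleaning rules in one function separate from extraction and loading mirrors the "predefined rules" step in the text: the rules can change without touching the code that moves the data.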