Design and Implementation of XBRL Data Analyzing System Based on Hadoop
|School||Dalian University of Technology|
|Keywords||XBRL Data; Hadoop Platform; Data Analyzing; Map/Reduce Model|
XBRL (eXtensible Business Reporting Language) is an application of XML to the exchange of financial reporting information. It supports the precise identification and parsing of financial statements, simplifies the preparation and definition of financial reporting information, reduces the cost of exchanging information over networks, and improves the reliability and accuracy of business reporting. Enterprises can use XBRL to automate the processing of financial data from collection through reporting, and reports in XBRL format make it faster and more effective for users such as investors, policy makers, and regulators to store, mine, analyze, and compare data.
As XBRL has been promoted among international financial institutions, listed companies have begun to file their quarterly financial reports in the XBRL standard. As a result, financial institutions receive a flood of XBRL financial reporting data every quarter. These reports, which record each listed company's quarterly financial information, are highly valuable for data mining, analysis, and research. The distributed computing platform Hadoop and the distributed programming model Map/Reduce, which address the processing and analysis of massive data, make it possible to store and analyze information based on XBRL data.
This article designs and implements an XBRL data analyzing system based on Hadoop. First, it reviews the current state of research on XBRL and Hadoop and sets out the functional and performance requirements of a massive XBRL data analysis system. It then builds models of the XBRL specification, taxonomy, and instance documents and studies the related XML parsing techniques, which provide important technical support for storing and analyzing XBRL data. Second, it presents the overall design of the data storage and analysis processes: the XBRL data are extracted and transformed with Map/Reduce and stored in HDFS and HBase. Third, it analyzes the XBRL data in Hive and derives investment feasibility assessments by calculating listed companies' financial indicators through IAHP (interval analytic hierarchy process). Finally, it stores the analysis results in HBase and supports queries over this massive body of information.
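
The extract-and-transform step described above can be illustrated with a minimal sketch in plain Python, offered as an illustration only and not as the system's actual Hadoop implementation. It assumes a tiny hypothetical XBRL instance (the `ex:` taxonomy namespace, the `ACME` identifier, and the `Assets` and `Liabilities` concepts are invented for the example). The map step emits one key-value pair per fact, the reduce step groups the facts by company and period, and a simple debt-to-assets ratio stands in for the financial indicators that would feed a later assessment.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace of the XBRL 2.1 instance schema, in ElementTree's {uri}tag form.
XBRLI = "{http://www.xbrl.org/2003/instance}"

# Hypothetical instance document; entity, taxonomy, and values are invented.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
      xmlns:ex="http://example.com/taxonomy">
  <context id="c2012Q1">
    <entity><identifier scheme="http://example.com/id">ACME</identifier></entity>
    <period><instant>2012-03-31</instant></period>
  </context>
  <unit id="usd"><measure>iso4217:USD</measure></unit>
  <ex:Assets contextRef="c2012Q1" unitRef="usd" decimals="0">1000</ex:Assets>
  <ex:Liabilities contextRef="c2012Q1" unitRef="usd" decimals="0">400</ex:Liabilities>
</xbrl>"""


def map_instance(xml_text):
    """Map step: emit ((entity, date), (concept, value)) for each numeric fact."""
    root = ET.fromstring(xml_text)
    contexts = {}
    for ctx in root.findall(XBRLI + "context"):
        entity = ctx.find(f"{XBRLI}entity/{XBRLI}identifier").text.strip()
        period = ctx.find(XBRLI + "period")
        # Instant contexts carry <instant>; duration contexts carry <endDate>.
        date = period.findtext(XBRLI + "instant") or period.findtext(XBRLI + "endDate")
        contexts[ctx.get("id")] = (entity, date.strip())
    for elem in root:
        ref = elem.get("contextRef")  # only facts carry a contextRef attribute
        if ref in contexts:
            concept = elem.tag.split("}", 1)[-1]  # drop the namespace prefix
            yield contexts[ref], (concept, float(elem.text))


def reduce_facts(pairs):
    """Reduce step: group the emitted facts by (entity, date)."""
    grouped = defaultdict(dict)
    for key, (concept, value) in pairs:
        grouped[key][concept] = value
    return dict(grouped)


def debt_ratio(facts):
    """Example indicator: total liabilities divided by total assets."""
    return facts["Liabilities"] / facts["Assets"]


if __name__ == "__main__":
    for key, facts in reduce_facts(map_instance(SAMPLE)).items():
        print(key, facts, debt_ratio(facts))
```

In a Hadoop deployment the same split would presumably map onto a Mapper and Reducer, with the grouped records written to HDFS or HBase instead of being returned in memory.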