Design and Implementation of Distributed File System for Massive Data
|School||Huazhong University of Science and Technology|
|Keywords||Distributed file system; Massive data; Metadata|
With the rapid development of the information age, the total volume of data is growing explosively. On the one hand, massive data is a rich source of information. On the other hand, it is difficult to store and analyze. To address this problem, Google presented its solution: the Google File System, which stores massive data, and the MapReduce programming model, which analyzes it. This paper focuses on the storage and management of massive data. After analyzing the design of many distributed file systems, we design and implement KiddenFS, a distributed file system for storing and managing massive data.

The first part of the paper describes the design and analysis of KiddenFS. First, we analyze the functional and non-functional requirements. Second, we present the key design decisions in KiddenFS, including its logical structure, its data storage mode, and its load balancing strategy. Finally, we design the architecture of KiddenFS and show the data flow of its key operations. KiddenFS is composed of a metadata server, data servers, and a client. The metadata server manages metadata, the data servers manage data, and the client provides the KiddenFS interface.

The second part of the paper describes the key algorithms and data structures of KiddenFS, covering data management, metadata management, the communication between data server and metadata server, between data server and client, and between client and metadata server, as well as the file system interface.

In the last part of the paper, we test KiddenFS and give an example showing how to use it in an application.
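The division of roles among metadata server, data server, and client can be illustrated with a minimal in-memory sketch. It is not the KiddenFS implementation: every class, method, and parameter name below is hypothetical, there is no networking or persistence, and the placement rule (send each new block to the data server holding the fewest blocks) is an assumed stand-in for the load balancing strategy, which the abstract does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Stores raw blocks keyed by block id (hypothetical layout)."""
    name: str
    blocks: dict = field(default_factory=dict)

    def put(self, block_id, payload):
        self.blocks[block_id] = payload

    def get(self, block_id):
        return self.blocks[block_id]


@dataclass
class MetaDataServer:
    """Maps each file path to an ordered list of (server name, block id)."""
    servers: dict = field(default_factory=dict)
    files: dict = field(default_factory=dict)
    next_block: int = 0

    def register(self, server):
        self.servers[server.name] = server

    def allocate(self, path):
        # Assumed placement policy: the server with the fewest blocks wins;
        # ties are broken by server name so the choice is deterministic.
        target = min(self.servers.values(),
                     key=lambda s: (len(s.blocks), s.name))
        block_id = self.next_block
        self.next_block += 1
        self.files.setdefault(path, []).append((target.name, block_id))
        return target, block_id

    def locate(self, path):
        # Unknown paths are treated as empty files in this sketch.
        return [(self.servers[name], block_id)
                for name, block_id in self.files.get(path, [])]


class Client:
    """File system interface: asks the metadata server where blocks live,
    then moves the data to or from the data servers directly."""
    BLOCK_SIZE = 4  # deliberately tiny so a short string spans several blocks

    def __init__(self, meta):
        self.meta = meta

    def write(self, path, data):
        # Appends the data to the file as a sequence of fixed-size blocks.
        for start in range(0, len(data), self.BLOCK_SIZE):
            server, block_id = self.meta.allocate(path)
            server.put(block_id, data[start:start + self.BLOCK_SIZE])

    def read(self, path):
        return b"".join(server.get(block_id)
                        for server, block_id in self.meta.locate(path))


meta = MetaDataServer()
meta.register(DataServer("ds-a"))
meta.register(DataServer("ds-b"))
client = Client(meta)
client.write("/demo.txt", b"hello kidden")
print(client.read("/demo.txt"))  # b'hello kidden'
```

In this sketch the metadata server never touches file contents. It only records which data server holds which block, so the bulk data flows directly between client and data servers, mirroring the split of responsibilities described above.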