Cloud storage system for mass data

Author AiMing
Advisor SunMingMing; ZhangZhongYang
School Nanjing University of Technology and Engineering
Major Computer Technology
Keywords Massive data; Cloud Computing; Cloud Storage; GlusterFS; Nutch; Hadoop; Mahout; Text Clustering
Type Master's thesis
Year 2012
Downloads 390
Citations 1
With the development of the Internet, the mobile Internet, and the Internet of Things, the number of online users keeps increasing and data volumes are growing explosively. The era of massive data has arrived, especially in the Internet, telecommunications, finance, and other industries. Faced with such vast amounts of data, the first problem is plain: the data has outgrown the capacity of a single machine, so building a large-scale, efficient, easily expandable, and highly reliable storage system is an urgent need. The second problem follows from the central role of information in an information society: an important trend in massive data is the socialization of data, which yields what is usually called unstructured data (for example text, images, audio, and video), and extracting useful information from such data has become one of the Internet's major challenges in recent years. This thesis studies mass data storage and massive data mining in response to these two problems. Because network data takes many forms, the study narrows its scope to keep the research tractable: taking scientific literature management as the example, it restricts the mass data source to electronic document data on the network. On this basis, it builds a cloud storage system for mass literature data on top of cloud storage and cloud computing platforms, and the system supports management and analysis of the literature data. A user first registers and can then upload documents (PDF files), which are stored in the cloud. The user can then manage the uploaded literature, for example by adding or deleting documents, and the system also provides literature information retrieval and cluster analysis.
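
The user workflow described in the abstract (register, upload, manage, retrieve) can be illustrated with a minimal sketch. It is not the thesis implementation: the class `LiteratureStore` and all of its method names are hypothetical, storage is an in-memory dictionary rather than a cloud file system such as GlusterFS, PDF text is assumed to be already extracted, and retrieval is a naive substring match standing in for a real index.

```python
from collections import defaultdict


class LiteratureStore:
    """Hypothetical in-memory stand-in for the cloud-backed literature store."""

    def __init__(self):
        self.users = set()
        # user -> {document title: extracted text}; a real system would
        # keep the PDF files themselves in distributed storage.
        self.docs = defaultdict(dict)

    def register(self, user):
        self.users.add(user)

    def upload(self, user, title, text):
        # Only registered users may upload, as the abstract describes.
        if user not in self.users:
            raise PermissionError(f"{user} is not registered")
        self.docs[user][title] = text

    def delete(self, user, title):
        # Deleting a missing title is treated as a no-op (an assumption).
        self.docs[user].pop(title, None)

    def list_titles(self, user):
        return sorted(self.docs[user])

    def search(self, user, keyword):
        # Naive case-insensitive substring match over the user's documents.
        needle = keyword.lower()
        return sorted(
            title
            for title, body in self.docs[user].items()
            if needle in body.lower()
        )
```

A caller would register a user, upload a few documents, and then call `search` or `list_titles`. Cluster analysis is left out of the sketch because the abstract does not describe how the system performs it.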

