Design and Implementation for Performance Optimizing of Small File for SQUID Storage System
|School||Shanghai Jiaotong University|
|Keywords||Squid; Massive Small File Storage; I/O Performance Optimization|
Boosting web browsing speed and improving user experience are a common challenge and a long-term task for all internet service providers. As a fundamental means of improving web service, proxy servers are highly regarded in industry.

Squid is a typical proxy server and the most widely used one in the world. It proxies data objects over HTTP, FTP, and other web protocols, and it supports multiple access control lists and various operating systems. Squid reduces bandwidth consumption and improves response speed by caching and reusing frequently requested web pages.

In this paper, we study the small-file cache of Squid in depth, and we design and implement algorithms that effectively identify hot data and increase the cache hit rate, together with the corresponding software modules. The main work of this paper is as follows:

(1) We begin by discussing common methods for improving web service quality, introduce proxy servers, and analyze existing storage systems for small files and the measures used to optimize the performance of massive small-file storage.
(2) We examine the system architecture and implementation principles of Squid. In particular, we analyze the COSS storage module and identify key issues in small-file storage.
(3) We design and implement a tiered cache scheme for hot and cold data, and we describe and implement the Relocation Cache method.
(4) We discuss the key problem of discerning hot data in a tiered cache. After comparing several popular schemes for discerning hot data, we propose a self-adaptive sliding window scheme that further boosts system performance.
(5) Using several web proxy server benchmarks, we test and analyze the system at three stages: before optimization, after the first optimization, and after the second optimization.

These tests indicate that the optimized COSS storage system caches hot data more effectively under heavy small-file access.
Therefore, this scheme can reduce the number of disk operations and improve overall system performance. The findings of this paper can serve as a useful reference for the design of other massive small-file storage systems.
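
To make the idea of discerning hot data with a sliding window more concrete, the following is a minimal sketch in Python. It is not the algorithm implemented in this work, whose details are not given in the abstract. The class name `SlidingWindowHotDetector`, the parameters `window`, `threshold`, `min_window`, and `max_window`, and the adaptation rule in `adapt` are all assumptions chosen for illustration: an object is treated as hot when it appears at least `threshold` times among the most recent `window` accesses, and the window is doubled or halved depending on an observed hit rate.

```python
from collections import Counter, deque


class SlidingWindowHotDetector:
    """Illustrative hot-data detector (hypothetical, not the paper's code)."""

    def __init__(self, window=8, threshold=2, min_window=4, max_window=64):
        self.window = window          # number of recent accesses remembered
        self.threshold = threshold    # accesses within the window to count as hot
        self.min_window = min_window
        self.max_window = max_window
        self.recent = deque()         # keys of the most recent accesses, oldest first
        self.counts = Counter()       # per-key access count inside the window

    def _trim(self):
        # Drop the oldest accesses until the history fits in the window.
        while len(self.recent) > self.window:
            old = self.recent.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def record(self, key):
        """Record one access and return True if the key is currently hot."""
        self.recent.append(key)
        self.counts[key] += 1
        self._trim()
        return self.counts[key] >= self.threshold

    def adapt(self, hit_rate, target=0.5):
        """Assumed adaptation rule: widen the window when the hit rate is
        below the target, narrow it otherwise, within fixed bounds."""
        if hit_rate < target:
            self.window = min(self.max_window, self.window * 2)
        else:
            self.window = max(self.min_window, self.window // 2)
        self._trim()
```

In a tiered cache, a `True` result from `record` could be the signal to keep or relocate an object in the hot tier, while `adapt` would be called periodically with a measured hit rate. Both the signal and the adaptation policy here are placeholders for whatever policy the actual system uses.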