GPU-accelerated simulation of high-speed particle collisions
|School||National University of Defense Technology|
|Course||Computer Science and Technology|
|Keywords||Molecular dynamics simulation; GPU computing; CUDA programming model; Brook language; domain decomposition method; particle index method|
Molecular dynamics (MD) simulation is an important computer simulation method that is widely used in biology, chemistry, materials science, and many other disciplines. However, computational performance has always been a major obstacle to the wider use of MD. In recent years, the GPU has become a research hotspot as a novel computing resource. Compared with a traditional CPU, a GPU offers higher performance, lower power consumption, and a better price/performance ratio. Using GPUs to accelerate molecular dynamics simulation can therefore shorten simulation time and increase simulation scale, so that molecular dynamics simulation can be applied more widely in practical engineering.

This paper takes a high-speed particle collision model as its object of study, implements GPU-accelerated molecular dynamics programs for the model using NVIDIA CUDA and the Brook language, and optimizes the algorithms for the GPU storage structure and for multiple GPUs. The main results are as follows:

1. An optimized domain decomposition algorithm. It improves on the traditional domain decomposition algorithm by decomposing the molecular dynamics simulation task twice, once on the general-purpose processors and once on the GPU. The first division ensures load balancing; the second addresses communication overhead and data reuse.
2. An improved particle index method. The general-purpose processing node sorts the particles so that the storage addresses of adjacent particles are as close together as possible. When threads on the acceleration node read particle information from global memory, the resulting data locality reduces the number of global memory reads, which saves time.
3. Program optimization for the GPU storage structure. By exploiting the banked design of on-chip shared memory, threads performing single-precision arithmetic access shared memory without conflicts, which reduces stream processor idle time.
4. Multi-GPU acceleration. The task is partitioned in parallel across general-purpose processors using the standard Message Passing Interface (MPI), so that the GPUs on each node compute in parallel and meet the demand for faster molecular dynamics simulation.

The accuracy and performance of the GPU-accelerated molecular dynamics simulation were tested. The results show that GPU acceleration has a significant effect on the MD algorithm. With 432,000 particles, acceleration on an AMD HD4870 improves the performance of the MD program by a factor of 4.8, and acceleration on a Tesla C1060 improves it by a factor of 6.5. With multi-GPU acceleration, the performance of the MD program improves by a factor of 11.2. At the same time, the GPU-accelerated MD program preserves the correctness of the results.
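The sorting idea behind the particle index method can be illustrated with a minimal host-side sketch. It is not the implementation described in this work: the cubic box, the uniform cell grid, and the names `cell_index` and `sort_by_cell` are all assumptions made for illustration. Particles are keyed by the linear index of the cell that contains them and then sorted, so that particles close together in space end up close together in the array.

```python
from typing import List, Tuple

Particle = Tuple[float, float, float]  # (x, y, z) position; layout is an assumption


def cell_index(p: Particle, box: float, cells_per_side: int) -> int:
    """Map a position inside a cubic box [0, box] to a linear cell index."""
    cell_size = box / cells_per_side
    # Clamp so a particle lying exactly on the upper face stays in the last cell.
    ix, iy, iz = (min(int(c / cell_size), cells_per_side - 1) for c in p)
    return (iz * cells_per_side + iy) * cells_per_side + ix


def sort_by_cell(
    particles: List[Particle], box: float, cells_per_side: int
) -> Tuple[List[Particle], List[int]]:
    """Reorder particles by cell index; also return the permutation applied."""
    order = sorted(
        range(len(particles)),
        key=lambda i: cell_index(particles[i], box, cells_per_side),
    )
    return [particles[i] for i in order], order


if __name__ == "__main__":
    pts = [(9.5, 9.5, 9.5), (0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (9.0, 9.9, 9.7)]
    sorted_pts, order = sort_by_cell(pts, box=10.0, cells_per_side=2)
    print(order)
    print(sorted_pts)
```

In a GPU program, the reordered array would typically be the one copied into device global memory, so that threads working on neighboring particles read neighboring addresses. The permutation is returned so that per-particle data stored in other arrays (velocities, forces) can be reordered consistently.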