Study on Application of Parallel Computing Based on GPU
|School||National University of Defense Science and Technology|
|Course||Electronics and Communication Engineering|
|Keywords||Algorithm optimization; Graphics Processing Unit (GPU); Compute Unified Device Architecture (CUDA); Data Encryption Standard (DES)|
In recent years, the graphics processing unit (GPU) has developed rapidly in both architecture and programmability, and its computing power continues to increase. The high-speed computing capability of the GPU is now increasingly used for general-purpose computing. This thesis analyzes the GPU architecture and the CUDA programming model in depth and summarizes methods for implementing algorithms in parallel and optimizing their execution in heterogeneous environments. Exploiting the highly parallel nature of GPU computing and the SIMT (Single Instruction, Multiple Thread) execution model, it implements the DES encryption algorithm in parallel and presents a feasibility analysis of DES encryption on the GPU, the technical difficulties involved, and optimization strategies for parallel execution. The following results have been achieved:
1. Based on the GPU hardware features and the CUDA (Compute Unified Device Architecture) programming model, the thesis studies and summarizes how application algorithms can be implemented in parallel and optimized in a heterogeneous environment. The main points are: defining how computing tasks and compute kernels are divided in the parallel implementation; optimizing memory access by satisfying the coalesced memory access mechanism, so that memory bandwidth does not become the performance bottleneck; improving hardware resource utilization through function calls; ensuring that computing resources are distributed evenly across the hardware during parallel execution; and using the high bandwidth and high speed of memory to optimize data traffic.
2. Following the optimization strategy for parallel implementation on heterogeneous CPU+GPU platforms, and taking into account the characteristics of the DES encryption algorithm, DES was executed in parallel under CUDA and achieved a good speedup.
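The task-division and coalesced-access points in the first result can be illustrated with a small CPU-side sketch. It is written in plain C++ rather than CUDA, and it is not taken from the thesis: the helper name `blocks_for_thread` and its parameters are hypothetical. It assumes the input is split into independent 64-bit DES blocks and that work is handed out in a grid-stride pattern, so that on each pass thread t touches element pass * total_threads + t and neighbouring threads read neighbouring addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (CPU-side illustration, not the thesis code):
// returns the indices of the 64-bit data blocks that thread `tid` would
// process when `n_blocks` blocks are divided among `total_threads`
// threads in a grid-stride pattern.
//
// On every pass, thread t handles index pass * total_threads + t, so
// consecutive threads access consecutive elements. That is the access
// pattern coalesced global-memory reads depend on.
std::vector<std::size_t> blocks_for_thread(std::size_t tid,
                                           std::size_t total_threads,
                                           std::size_t n_blocks) {
    assert(total_threads > 0 && tid < total_threads);
    std::vector<std::size_t> indices;
    for (std::size_t i = tid; i < n_blocks; i += total_threads) {
        indices.push_back(i);
    }
    return indices;
}
```

For example, with 4 threads and 10 data blocks, thread 1 would be assigned blocks 1, 5 and 9. In a real CUDA kernel the same loop would typically start at the global thread index and step by the total number of launched threads.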
By studying and analyzing the encryption algorithm, the thesis presents strategies for porting DES to the GPU and optimizing it there. Based on the steps of the DES encryption algorithm and the hardware characteristics, the thesis divides the algorithm into three core functions: kernel_mov, the circular left shift operation; kernel_xor, the parallel reduction operation; and kernel_LaR, the iteration operation. The DES compute kernels were optimized with respect to the division of computing tasks, data communication, data storage, kernel functions, branch elimination, asynchronous calls, and stream operations, among other aspects. These measures not only greatly improve the overall performance of the algorithm but also exploit the efficiency of the GPU hardware for parallel execution more fully. Test results show that the DES encryption algorithm running on an NVIDIA GeForce 8800 GTX achieves a speedup of 3.86 times relative to an Intel Pentium IV.
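The abstract does not show the kernels themselves, so the following is only a sketch of the kind of operation a circular-left-shift kernel such as kernel_mov would have to perform, assuming it corresponds to the rotation of the 28-bit key halves in the standard DES key schedule. It is plain C++ rather than CUDA; `rotl28`, `shifted_half` and `kDesShifts` are illustrative names, not identifiers from the thesis. The per-round shift amounts are those of the DES standard.

```cpp
#include <cassert>
#include <cstdint>

// Per-round left-shift amounts of the DES key schedule.
// They sum to 28, so after 16 rounds each half returns to its start.
constexpr int kDesShifts[16] = {1, 1, 2, 2, 2, 2, 2, 2,
                                1, 2, 2, 2, 2, 2, 2, 1};

// Rotate a 28-bit key half (held in the low 28 bits of a 32-bit word)
// left by n positions, where n is 1 or 2. The upper 4 bits stay zero.
constexpr std::uint32_t rotl28(std::uint32_t half, int n) {
    return ((half << n) | (half >> (28 - n))) & 0x0FFFFFFFu;
}

// Apply the shifts of the first `rounds` rounds to one key half.
std::uint32_t shifted_half(std::uint32_t half, int rounds) {
    assert(rounds >= 0 && rounds <= 16);
    for (int r = 0; r < rounds; ++r) {
        half = rotl28(half, kDesShifts[r]);
    }
    return half;
}
```

Because the rotation contains no data-dependent branch, every thread in a warp would follow the same instruction path, which is in line with the branch-elimination goal mentioned above.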