Algorithm Study on Accelerating C-V Image Segmentation and Exterior Industrial Image Reconstruction with CUDA
Keywords: Industrial Computed Tomography; Compute Unified Device Architecture (CUDA); Parallel Computing; Image Reconstruction; Image Segmentation
Computed tomography (CT) is widely applied in modern medical diagnosis and industrial non-destructive evaluation (NDE). Exterior CT arises under certain practical constraints. To obtain high-quality reconstructed images in practice, the popular reconstruction method SART is used. Parallel computing plays an important role in almost every area of modern science, such as numerical weather forecasting and numerical bridge modeling. Some special CT applications impose strict time requirements. Because CT computation is data-parallel by nature, studying parallel computing schemes for CT is necessary to reduce reconstruction time. Image segmentation is widely used in clinical medicine, industry, and space science. In these applications segmentation accuracy is very important, so time-consuming segmentation methods often have to be used to obtain precise results; parallel computing is applied to accelerate the segmentation procedure. The TVM-SA-POCS exterior CT reconstruction method is numerically stable and yields high-quality reconstructed images. Like other iterative reconstruction methods, however, its bottleneck is computation time, and it cannot be used in practical applications unless this high computational cost is overcome. Given the advantages of TVM-SA-POCS in practical image reconstruction, this paper uses CUDA high-performance computing technology to accelerate its time-consuming reconstruction procedure. Researchers at home and abroad have studied the acceleration of image reconstruction extensively and have proposed a number of schemes for speeding up CT image reconstruction.
These solutions fall into two categories. One revises or improves the reconstruction algorithm itself to reduce its computational cost; the other implements the reconstruction method on parallel computing hardware to reduce reconstruction time. The first is called the software method and the second the hardware method. In the hardware method, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), graphics processing units (GPUs), and personal computer clusters (PC clusters) serve as the parallel computing devices. ASICs and PC clusters are costly compared with the other devices, so small companies and institutes cannot afford them. FPGAs and GPUs are readily available, inexpensive, and offer high computational performance, but a GPU provides greater parallel computing capability and more memory for data storage and transfer than an FPGA, which gives the GPU an advantage in accelerating CT image reconstruction and image segmentation. Consequently, many researchers have focused on GPU-based acceleration of reconstruction and segmentation. The GPU has further merits over the other devices mentioned above: it is updated quickly, and its parallel computing capability grows at almost three times the rate of Moore's law; faster general-purpose GPUs keep becoming available at low prices; and it is easy to program, with a gentle learning curve and programs that port easily from one device generation to another. CUDA has become an important tool for accelerating industrial CT image reconstruction and image segmentation, and more and more examples are available. CUDA offers a unified hardware and software solution for parallel computing on CUDA-enabled NVIDIA GPUs, supporting the standard C programming language together with high-performance numerical computing libraries.
In this paper, we analyze the characteristics of the exterior industrial CT reconstruction method TVM-SA-POCS, the C-V image segmentation algorithm, and CUDA-enabled NVIDIA GPU devices, in order to adapt the algorithms to the hardware and obtain higher speedups. We work to increase computing density with a flexible scheme on the CUDA device and to reduce data transfer delay so as to improve transfer efficiency; CUDA shared memory is also taken into account. Parallel sum reduction is used to accelerate the vector dot product and the L2 norm. In short, our work mainly increases computing density and makes data transfer more efficient. Experiments show that CUDA accelerates exterior industrial CT image reconstruction by almost 20 times compared with the CPU without loss of image quality, and accelerates the C-V image segmentation method by roughly 30 to 40 times compared with the CPU while producing the same segmentation result. These results show that CUDA can speed up exterior industrial CT image reconstruction and C-V image segmentation efficiently and without quality loss.
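To illustrate the parallel sum reduction mentioned above for the dot product and the L2 norm, the following is a minimal sketch that simulates the tree-shaped reduction pattern on the CPU in plain Python. It is not the implementation used in this work: the function names (`tree_reduce_sum`, `dot`, `l2_norm`) and the zero-padding to a power of two are assumptions made for illustration. On a CUDA device, each iteration of the inner loop would typically be executed by a separate thread on a shared-memory buffer, with a barrier between strides.

```python
import math


def tree_reduce_sum(values):
    """Sum values using the pairwise (tree) pattern of a parallel reduction.

    Hypothetical helper: this runs sequentially and only mimics the access
    pattern that a GPU block would execute in parallel.
    """
    buf = [float(v) for v in values]
    n = len(buf)
    if n == 0:
        return 0.0

    # Assumption: pad with zeros up to the next power of two so that the
    # buffer can be halved cleanly at every step.
    size = 1
    while size < n:
        size *= 2
    buf.extend([0.0] * (size - n))

    stride = size // 2
    while stride > 0:
        # On a GPU, each index i would be one thread, and a barrier would
        # follow this loop before the stride is halved.
        for i in range(stride):
            buf[i] += buf[i + stride]
        stride //= 2

    return buf[0]


def dot(x, y):
    """Dot product: elementwise multiply, then reduce the products."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return tree_reduce_sum(a * b for a, b in zip(x, y))


def l2_norm(x):
    """L2 norm as the square root of the dot product of x with itself."""
    return math.sqrt(dot(x, x))
```

The reduction takes a number of steps proportional to log2 of the padded length rather than to the length itself, which is why this pattern suits the many-thread execution model of a GPU; the elementwise products in `dot` are independent and can likewise be computed in parallel.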