Detecting Gene-Gene Interactions Based on Swarm Intelligence and a Conflict Avoidance Strategy, and Its Parallel Computing
|School||National University of Defense Science and Technology|
|Course||Computer Science and Technology|
|Keywords||Gene-gene interactions; Single nucleotide polymorphism; Swarm intelligence; Conflict avoidance strategy; Java platform parallelism; Multi-core processors|
Bioinformatics is an emerging discipline formed at the intersection of life science and information science, and it aims to mine the biological significance of experimental data efficiently. The study of gene-gene interactions is an important problem in genome-level bioinformatics analysis and has important implications for understanding the etiology of complex diseases. Current research on gene-gene interactions focuses on genome-wide interaction detection, which faces two main challenges. First, detecting interactions in genome-wide data carries an intensive computational burden. Second, interaction detection is influenced by marginal effects. Single nucleotide polymorphisms (SNPs) are widespread in the genome and easy to measure, so they are often used as the genetic markers studied in genetic association analyses.
Existing algorithms for gene-gene association studies have several shortcomings, and there are two ways to address them: the first is to study the algorithms themselves and propose a more efficient and faster one, and the second is to accelerate an algorithm on a parallel computing platform. Following these two approaches, this work makes two main contributions. The first is a gene-gene interaction detection algorithm based on swarm intelligence and a conflict avoidance strategy. Two randomized algorithms used in this field, ant colony optimization and SNPHarvester, are each improved to overcome their defects. The two improved algorithms are then combined, and a conflict avoidance strategy allocates search resources rationally. The result is a gene-gene interaction detection algorithm that can be applied genome-wide and is not restricted by marginal effects. The algorithm takes SNP groups as its objects of study and selects, from a large number of SNPs, the SNP groups that show significant interaction.
The algorithm initializes multiple SNP groups as starting points and generates multiple search paths, using local-extremum search to preserve high-order interactions. The ants communicate with one another through a probability density. The conflict avoidance strategy reduces crossing and overlap between paths, so the resulting solutions reflect more broadly how gene-gene interactions are distributed across the genome. Experiments on simulated and real data confirm that, compared with SNPHarvester in terms of statistical power, the algorithm has clear advantages in efficiency, and that its results are broadly representative of the distribution of gene-gene interactions in the genome. The second contribution is a parallel algorithm based on the Java platform. The parallel algorithm targets the hardware environment of a personal computer: it takes full advantage of multi-core processors and opens multiple threads to accelerate the original algorithm. The results of the parallel program were verified by comparison with those of the original program. The algorithm scales well to GPU (graphics accelerator), portable cluster, and supercomputer platforms, and the work also shows the great potential of Java for developing parallel programs for biological data processing.
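The multi-threaded acceleration described above can be illustrated with a minimal Java sketch. It is not the implementation from the thesis: the class name `ParallelSnpScoring`, the genotype encoding (0, 1, 2), the case/control labels (0, 1), and the use of a pairwise Pearson chi-square statistic as the interaction score are all assumptions introduced here for illustration. The sketch scores candidate SNP pairs on a fixed thread pool and keeps a sequential version so that the parallel results can be checked against it, mirroring the verification step mentioned in the abstract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names and scoring function are assumptions.
class ParallelSnpScoring {

    // Pearson chi-square statistic for the 9 x 2 contingency table of joint
    // genotypes (SNP a, SNP b) versus case/control status.
    // Assumes genotypes[sample][snp] is 0, 1 or 2 and labels[sample] is 0 or 1.
    static double chiSquare(int[][] genotypes, int[] labels, int a, int b) {
        double[][] observed = new double[9][2];
        for (int s = 0; s < labels.length; s++) {
            observed[genotypes[s][a] * 3 + genotypes[s][b]][labels[s]]++;
        }
        double[] rowSum = new double[9];
        double[] colSum = new double[2];
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 2; c++) {
                rowSum[r] += observed[r][c];
                colSum[c] += observed[r][c];
            }
        }
        double n = labels.length;
        double chi = 0.0;
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 2; c++) {
                double expected = rowSum[r] * colSum[c] / n;
                if (expected > 0) { // skip genotype combinations that never occur
                    double diff = observed[r][c] - expected;
                    chi += diff * diff / expected;
                }
            }
        }
        return chi;
    }

    // Reference version: scores every candidate pair on the calling thread.
    static double[] scoreSequential(int[][] genotypes, int[] labels, int[][] pairs) {
        double[] scores = new double[pairs.length];
        for (int i = 0; i < pairs.length; i++) {
            scores[i] = chiSquare(genotypes, labels, pairs[i][0], pairs[i][1]);
        }
        return scores;
    }

    // Parallel version: one task per candidate pair on a fixed-size thread pool.
    // The input arrays are only read, so the tasks need no synchronization.
    static double[] scoreParallel(int[][] genotypes, int[] labels, int[][] pairs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int[] pair : pairs) {
                Callable<Double> task = () -> chiSquare(genotypes, labels, pair[0], pair[1]);
                futures.add(pool.submit(task));
            }
            double[] scores = new double[pairs.length];
            for (int i = 0; i < scores.length; i++) {
                scores[i] = futures.get(i).get(); // blocks until task i is done
            }
            return scores;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller might pass `Runtime.getRuntime().availableProcessors()` as the `threads` argument to match the number of cores. One task per pair keeps the sketch short; for genome-wide data, batching many pairs into each task would likely reduce scheduling overhead.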