Research on Structural Learning and Inference in Bayesian Networks
School: Xi'an University of Electronic Science and Technology
Keywords: Bayesian networks; structural learning; conditional independence test; junction tree; conditional linear Gaussian networks; maximal prime decomposition; Artificial Bee Colony optimization; Markov boundary; Markov equivalence; inference
A Bayesian network (BN) is a general graphical model developed by integrating probability theory with graph theory. Because it rests on a solid theoretical foundation and has unique advantages in encoding and reasoning about uncertain knowledge, it has been successfully applied to a wide range of tasks, such as machine learning, artificial intelligence, bioinformatics, economic analysis, and forecasting. Constructing a Bayesian network by domain experts alone is usually difficult, even impossible. Therefore, structural learning and inference in Bayesian networks from data have become key and difficult problems in this research field. After analyzing the relevant theory of Bayesian networks, this paper focuses on the structure learning of Bayesian networks and establishes systematic theory and optimization methods from different angles. All of these may provide an advantageous basis for the construction and application of Bayesian networks. In addition, we propose a new algorithm for inference in conditional linear Gaussian (CLG) Bayesian networks based on the specific characteristics of such models. The main work can be summarized as follows:

1. To learn the equivalence classes of large-scale Bayesian networks, a new algorithm is developed based on maximal prime decomposition. First, we introduce a method, strongly inspired by graph decomposition theory, that decomposes a BN into its maximal prime subgraphs using the full conditional independence information encoded in the moral graph. In this way, a complex high-dimensional data set is reduced to several small data sets, and the problem of searching for the whole structure is split into the recovery of local causal relationships. We theoretically prove that the local statistical information of each variable is preserved. Meanwhile, the Markov boundary of each variable can be built under the embedded faithfulness condition, which improves the power of the conditional independence tests.
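The moral graph mentioned in contribution 1 is obtained by "marrying" every pair of parents that share a child and then dropping edge directions. A minimal sketch of that moralization step (the function name and DAG encoding are illustrative assumptions, not the thesis's actual implementation):

```python
from itertools import combinations

def moral_graph(parents):
    """Build the moral graph of a DAG given as {node: [parents]}.

    Moralization: connect ("marry") every pair of parents that share a
    child, then drop all edge directions. The resulting undirected
    graph encodes the conditional independence information used for
    the maximal prime decomposition.
    """
    nodes = set(parents)
    for ps in parents.values():
        nodes.update(ps)
    edges = set()
    for child, ps in parents.items():
        for p in ps:                      # original edges, undirected
            edges.add(frozenset((p, child)))
        for a, b in combinations(ps, 2):  # marry co-parents
            edges.add(frozenset((a, b)))
    return nodes, edges

# Classic example: X -> Z <- Y. Moralization adds the edge X - Y.
nodes, edges = moral_graph({"Z": ["X", "Y"], "X": [], "Y": []})
```

On this v-structure the moral graph has three edges: X-Z, Y-Z, and the new moral edge X-Y.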
2. To address the drawbacks of learning Bayesian networks from small-sample data, this paper proposes an optimization approach based on a prior node ordering. It first defines 1-step dependence coefficients to estimate the dependence among variables, from which it constructs an objective function as an overall measure of the dependence encoded by a graph structure. The problem of searching for the structure is thus turned into finding the maximum of the objective function over the feasible field, which offers a new perspective for research on extended Bayesian networks. In addition, we prove the existence and uniqueness of the numerical solution. Moreover, we extend this idea to data classification without a node ordering. Theoretical and experimental results show that the new classifier has a better structure and performance than naive Bayes and tree-augmented naive Bayes.

3. A hybrid algorithm that combines the search-and-score method with a recently introduced meta-heuristic, Artificial Bee Colony optimization, is introduced for learning Bayesian networks. We first describe the solution space and the generation of the initial solution for our learning problem. Bees then search for a neighboring solution according to the pheromone calculated from the K2 score function and three operators. After that, the different types of bees collaborate and share their information until a termination criterion is met. Simulation results show that the new algorithm not only has better learning ability and faster convergence, but also better solution quality than two other state-of-the-art algorithms reported in the literature.

4. Based on the study of conditional linear Gaussian (CLG) networks, this paper makes a deep analysis of propagation in CLG Bayesian networks, and further proves several important operators that can be used to eliminate variables during the message-passing phase.
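The K2 score driving the search in contribution 3 is a standard Bayesian-Dirichlet local score; a minimal sketch of its log form for one node and one candidate parent set (the function name and data encoding are assumptions for illustration, not the thesis's code):

```python
from math import lgamma
from collections import defaultdict

def k2_score(data, child, parents, r):
    """Log K2 score of `child` given a candidate parent set.

    data: list of dicts mapping variable name -> discrete value
    r:    number of states of `child`
    For each parent configuration j with counts N_ijk and total N_ij,
    the log K2 score adds
        lgamma(r) - lgamma(N_ij + r) + sum_k lgamma(N_ijk + 1).
    The score of a whole DAG is the sum of these local scores, which is
    what a search-and-score procedure compares between neighboring DAGs.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for row in data:
        j = tuple(row[p] for p in parents)   # parent configuration
        counts[j][row[child]] += 1
    score = 0.0
    for child_counts in counts.values():
        n_ij = sum(child_counts.values())
        score += lgamma(r) - lgamma(n_ij + r)
        for n_ijk in child_counts.values():
            score += lgamma(n_ijk + 1)
    return score
```

For example, with a binary child observed as [0, 0, 1] and no parents, the score is log((r-1)! * 2! * 1! / (3 + r - 1)!) = log(1/12); because the score decomposes over nodes, a candidate move in the search only requires rescoring the node whose parent set changed.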
Meanwhile, a propagation algorithm for probabilistic inference in CLG Bayesian networks based on the strong junction tree is presented. By performing semantic modeling before physical computation, the proposed algorithm decomposes the potential functions corresponding to the cliques and separators of the junction tree, and gives the target probability distribution before and after variable elimination by transforming the calculation into a series of formula deductions. Iteration between semantic modeling and physical computation is thus avoided.
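As one concrete illustration of eliminating a continuous variable in a CLG model (a minimal single-parent scalar sketch under assumed names, not the thesis's actual operators): if X ~ N(mu_x, var_x) and Y | X ~ N(a*X + b, var_y), integrating X out gives Y ~ N(a*mu_x + b, a^2 * var_x + var_y).

```python
def marginalize_linear_gaussian(mu_x, var_x, a, b, var_y):
    """Integrate X out of a linear Gaussian pair.

    X ~ N(mu_x, var_x), Y | X ~ N(a*X + b, var_y)
    Returns the mean and variance of the marginal Y,
    Y ~ N(a*mu_x + b, a**2 * var_x + var_y).
    Closed-form operators of this kind are what make exact message
    passing over a strong junction tree possible in CLG networks.
    """
    return a * mu_x + b, a * a * var_x + var_y

# X ~ N(1, 2), Y | X ~ N(3X + 0.5, 0.25)  =>  Y ~ N(3.5, 18.25)
mu, var = marginalize_linear_gaussian(1.0, 2.0, 3.0, 0.5, 0.25)
```

The same recursion, applied clique by clique in the elimination order of the strong junction tree, removes continuous variables before the discrete ones they depend on.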