Improvement and Application of the decision tree algorithm based on the
|Course||Traffic Information Engineering & Control|
|Keywords||Data Mining; Decision tree; ID3 algorithm; Object-oriented method; XML|
With the extensive application of Data Mining (DM), it has become increasingly important to extract relevant knowledge and rules from data, and many data mining techniques have been proposed in recent decades to address this problem.

The decision tree is one of the core algorithms of DM and is widely used in commerce. The best-known decision tree algorithm is ID3, presented by Quinlan in 1986. It is not an incremental algorithm, and it uses information entropy as the criterion for selecting attributes. ID3 has three disadvantages: first, it tends to select attributes that have many distinct values, although such attributes are not always the best; second, it can handle only discrete attributes, not continuous ones; third, the values of all attributes on the path from the root node to a leaf node must be known.

To solve these problems, this paper introduces a new algorithm that improves on ID3. Comparing the decision trees built by ID3 and by the new algorithm on the same example shows that the new algorithm performs better.

Following the object-oriented method, this paper implements both ID3 and the improved algorithm in Java and applies the improved algorithm to classify order management data in the bookshop of an e-commerce system. In the rule-extraction phase, the model can easily be converted into IF-THEN classification rules. In addition, XML is used in the improvement and implementation of ID3, on the grounds that "XML can characterize all kinds of data, exchange different kinds of data and solve the problem of united interface".
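
The entropy-based attribute selection that ID3 relies on can be illustrated with a minimal Java sketch. This shows only the standard information-gain computation, not the improved algorithm proposed in this paper. The class name `Id3Sketch`, the method names, and the small bookshop-order data set are all hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of ID3-style attribute selection (not the paper's code).
public class Id3Sketch {

    /** Shannon entropy, in bits, of a list of class labels. */
    public static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    /**
     * Information gain of splitting the rows on the attribute in column attr.
     * Assumption: the last column of every row holds the class label.
     */
    public static double informationGain(List<String[]> rows, int attr) {
        int labelCol = rows.get(0).length - 1;
        List<String> allLabels = new ArrayList<>();
        Map<String, List<String>> partitions = new HashMap<>();
        for (String[] row : rows) {
            allLabels.add(row[labelCol]);
            partitions.computeIfAbsent(row[attr], k -> new ArrayList<>())
                      .add(row[labelCol]);
        }
        // Weighted entropy of the partitions produced by the split.
        double remainder = 0.0;
        for (List<String> part : partitions.values()) {
            remainder += ((double) part.size() / rows.size()) * entropy(part);
        }
        return entropy(allLabels) - remainder;
    }

    /** Index of the attribute with the highest information gain. */
    public static int bestAttribute(List<String[]> rows) {
        int labelCol = rows.get(0).length - 1;
        int best = 0;
        double bestGain = Double.NEGATIVE_INFINITY;
        for (int attr = 0; attr < labelCol; attr++) {
            double gain = informationGain(rows, attr);
            if (gain > bestGain) {
                bestGain = gain;
                best = attr;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical orders: {customerType, paymentMethod, classLabel}
        List<String[]> orders = Arrays.asList(
            new String[] {"member", "card", "approve"},
            new String[] {"member", "cash", "approve"},
            new String[] {"guest", "card", "review"},
            new String[] {"guest", "cash", "review"});
        for (int attr = 0; attr < 2; attr++) {
            System.out.println("gain(attribute " + attr + ") = "
                + informationGain(orders, attr));
        }
        System.out.println("selected attribute index: " + bestAttribute(orders));
    }
}
```

In this toy data set the first attribute separates the two classes perfectly, so it is the one selected. The same computation also exposes the first disadvantage noted above: an attribute with a distinct value for every row would split the data into pure single-row partitions and so obtain the maximum gain, regardless of its predictive value.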