Research on Association Rule Mining Techniques Based on Transaction Data Tables
|Course||Pattern Recognition and Intelligent Systems|
|Keywords||Association rules; Complete frequent itemsets; Frequent closed itemsets; Maximal frequent itemsets; Binding; Pruning techniques; Transaction data table (TD)|
Association rule mining is an important branch of data mining research, and its main difficulty lies in the massive size of the data being mined. The Apriori algorithm requires multiple scans of the database, which makes it impractical for mining massive real-world databases. The FP-Growth algorithm is about an order of magnitude more efficient than Apriori, but its memory consumption also makes it difficult to apply to massive databases. There is a large body of research literature on association rules, both in China and internationally, and most of it focuses on improving these two algorithms. This thesis studies how to compute the frequent itemsets of a given transaction database and how to generate valid association rules from the maximal frequent itemsets. Following the classification of frequent itemset mining, it proposes frequent itemset mining algorithms based on a transaction data table (TD), which are used to generate the complete frequent itemsets, the frequent closed itemsets, and the maximal frequent itemsets. These algorithms need to scan the transaction database only once during the entire mining process. Generating association rules from the maximal frequent itemsets can produce a large number of redundant rules, which makes it very difficult for users to analyze and use them. The thesis examines a variety of existing pruning techniques for association rules, identifies their problems, and proposes binding as a new pruning technique. Finally, the TD-based frequent itemset mining algorithm is applied to frequent itemset mining on the Mushroom database, and the analysis shows that it outperforms the FP-Growth algorithm in both execution time and space consumption.
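To make the three kinds of itemsets concrete, the following is a minimal Python sketch. It is not the TD algorithm of this thesis, whose details the abstract does not give. It assumes a generic vertical (Eclat-style) approach: a single pass over the transactions builds a table mapping each item to the ids of the transactions that contain it, and all later support counts come from intersecting those id sets. The function names `mine_frequent` and `closed_and_maximal`, the toy transactions, and the minimum support of 2 are illustrative assumptions.

```python
def mine_frequent(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    Assumption: a generic vertical (Eclat-style) miner, not the thesis's TD algorithm.
    """
    # Single pass over the transactions: item -> set of transaction ids.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        # Grow the prefix by one item at a time; support is the size of the
        # intersected id set, so the transactions are never rescanned.
        for i, (item, tids) in enumerate(candidates):
            new_tids = prefix_tids & tids if prefix else tids
            if len(new_tids) >= min_support:
                itemset = prefix | {item}
                frequent[itemset] = len(new_tids)
                extend(itemset, new_tids, candidates[i + 1:])

    ordered = sorted(tidsets.items(), key=lambda pair: pair[0])
    extend(frozenset(), set(), ordered)
    return frequent


def closed_and_maximal(frequent):
    """Split frequent itemsets into closed and maximal ones."""
    closed, maximal = set(), set()
    for itemset, support in frequent.items():
        supersets = [s for s in frequent if itemset < s]
        # Maximal: no frequent proper superset exists.
        if not supersets:
            maximal.add(itemset)
        # Closed: every proper superset has strictly lower support.
        if all(frequent[s] < support for s in supersets):
            closed.add(itemset)
    return closed, maximal


# Hypothetical toy data, minimum support count of 2.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
frequent = mine_frequent(transactions, min_support=2)
closed, maximal = closed_and_maximal(frequent)
```

On this toy data the complete frequent itemsets are {a}, {b}, {c}, {a, b}, and {a, c}; the closed ones are {a}, {a, b}, and {a, c}; and the maximal ones are {a, b} and {a, c}. Every maximal itemset is also closed, which is why the maximal set is the most compact of the three representations, at the cost of losing the exact supports of its subsets.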