Frequent Itemsets Mining Algorithms

Author KouXiangXia
Tutor RenYongGong
School Liaoning Normal University
Course Computer Software and Theory
Keywords Frequent Itemsets, Pre-FIUT, Data Streams, FIUT-Stream
CLC TP311.13
Type Master's thesis
Year 2012

With the rapid development of information technology, and especially the widespread adoption of database technology and its applications, people are confronted with a fast-expanding ocean of data. Data mining technology was developed to make effective use of these massive amounts of data. Association rule mining, a branch of data mining, has become a research hotspot in recent years. It consists of two steps: first, find all the frequent itemsets; second, generate strong association rules from these frequent itemsets. Identifying all the frequent itemsets is the more fundamental and important of the two steps, and it has been an important topic in data mining in recent years. This thesis first introduces the basic concepts of frequent itemset mining and the classical frequent itemset mining algorithms, and then studies incremental mining and data stream mining in depth.

The specific research work is as follows.

First, for incremental frequent itemset mining, most existing algorithms are based on the FP-tree. This thesis proposes the Pre-FIUT algorithm, which introduces the frequent items ultrametric tree structure to improve mining efficiency and uses the concept of pre-large itemsets to mine frequent itemsets incrementally. Owing to the properties of pre-large itemsets, the proposed approach does not need to rescan the original database until a certain number of new transactions have been inserted. Experimental results show that the proposed approach scans and updates data quickly, uses memory more economically, and obtains the exact frequent itemsets.

Second, for frequent itemset mining over data streams, this thesis proposes a new algorithm, FIUT-Stream, designed around the characteristics of data streams.
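The two-step structure of association rule mining described above (frequent itemsets first, then strong rules) can be illustrated with a minimal sketch. This is a brute-force illustration only, not the FIUT or Pre-FIUT algorithm; the function names `frequent_itemsets` and `strong_rules` and the threshold parameters `min_support` and `min_confidence` are assumptions chosen for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: return {itemset: support} for every itemset meeting min_support.

    Brute-force enumeration for illustration; real miners avoid this cost.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, k):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            # No frequent k-itemset implies no frequent (k+1)-itemset.
            break
    return result


def strong_rules(frequent, min_confidence):
    """Step 2: return (antecedent, consequent, confidence) for strong rules."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[a] is always present.
                confidence = support / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules
```

Step 1 dominates the cost in this sketch, since it counts support for every candidate against every transaction, which is why the efficiency of frequent itemset mining is treated as the central problem.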
In the FIUT-Stream algorithm, the data in the sliding window are compressed into a BitTable, and the window is updated by dividing the incoming data into blocks of equal length. The FIUT algorithm is then used to mine the window, yielding the exact frequent itemsets in the data stream. Experiments show that FIUT-Stream achieves a reasonable balance between time and space.

Both algorithms proposed in this thesis are frequent itemset mining algorithms. They build on the batch-mining FIUT algorithm, adapting it to the characteristics of incremental mining and data stream mining, and thereby further improve the efficiency of frequent itemset mining.
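The idea of a block-updated sliding window stored as bit vectors can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's BitTable layout or the FIUT-Stream algorithm itself: the class name `SlidingBitWindow` and its methods are hypothetical, each item is stored as one integer bitmap per block, and the FIUT mining step is replaced here by direct support counting.

```python
from collections import deque


class SlidingBitWindow:
    """Sliding window made of blocks; each block keeps one bitmap per item."""

    def __init__(self, max_blocks):
        # A deque with maxlen evicts the oldest block when a new one arrives.
        self.blocks = deque(maxlen=max_blocks)

    def add_block(self, transactions):
        """Compress one block: bit i of an item's bitmap is set when
        transaction i of the block contains that item."""
        bitmaps = {}
        for pos, transaction in enumerate(transactions):
            for item in transaction:
                bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
        self.blocks.append((len(transactions), bitmaps))

    def support_count(self, itemset):
        """Count window transactions containing every item of itemset
        by AND-ing the item bitmaps block by block."""
        total = 0
        for size, bitmaps in self.blocks:
            mask = (1 << size) - 1  # all transactions of the block
            for item in itemset:
                mask &= bitmaps.get(item, 0)
            total += bin(mask).count("1")
        return total

    def size(self):
        """Number of transactions currently in the window."""
        return sum(size for size, _ in self.blocks)
```

Keeping a separate set of bitmaps per block makes the window update cheap: expiring old data amounts to dropping one block, with no need to shift or rewrite the bitmaps of the blocks that remain.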
