
Research on Data Stream Frequent Itemsets Mining

Author ZhengXueShuang
Advisor HuangHouKuan
School Beijing Jiaotong University
Major Computer Applications
Keywords Data stream; Data mining; Data stream mining; Frequent itemsets
CLC TP311.13
Type Master's thesis
Year 2007

A data stream is an ordered sequence of items that arrive over time. Unlike data in a traditional static database, a data stream is continuous and unbounded, usually arrives at high speed, and has a distribution that changes over time. These characteristics make traditional frequent itemset mining algorithms difficult to apply, and many researchers have therefore studied frequent itemset mining over data streams. It has become one of the basic problems in data mining.

Starting from the characteristics of data streams, this thesis examines the key issues in data stream processing and data stream mining, and studies techniques for addressing some of them. It introduces classic frequent itemset mining algorithms and runs experiments on them. The experiments and analysis show that the unbounded, high-speed nature of data streams makes the classic algorithms difficult to apply to stream data. The thesis also analyzes and summarizes the existing frequent itemset mining algorithms for data streams.

Finally, the thesis proposes the FP-CountMin algorithm. The algorithm divides the data stream into segments and uses an improved FP-growth algorithm to mine the frequent itemsets of each segment. It then uses a Count-Min Sketch to count the itemsets, which addresses the problem of compressing and computing the itemset statistics quickly and efficiently. An experimental comparison with the FP-DS algorithm shows that FP-CountMin has better time efficiency.
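The segment-then-count idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract gives no pseudocode, so every name here (`CountMinSketch`, `frequent_in_segment`, `process_stream`, `itemset_key`) and every parameter (`width`, `depth`, `segment_size`, `min_count`, `max_len`) is hypothetical. For brevity, the per-segment mining step uses brute-force enumeration of small itemsets as a stand-in for the improved FP-growth algorithm.

```python
import hashlib
from itertools import combinations


class CountMinSketch:
    """Approximate counter: estimates never undercount but may overcount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One salted hash per row; the salt makes the rows (nearly) independent.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # The minimum over rows is the cell least inflated by collisions.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


def itemset_key(itemset):
    """Canonical string key for an itemset (assumes items are strings)."""
    return "|".join(sorted(itemset))


def frequent_in_segment(segment, min_count, max_len=3):
    """Stand-in for FP-growth: brute-force count of itemsets up to max_len."""
    counts = {}
    for transaction in segment:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {c: n for c, n in counts.items() if n >= min_count}


def process_stream(transactions, segment_size, min_count, sketch):
    """Cut the stream into fixed-size segments and sketch each segment's
    locally frequent itemsets. A trailing partial segment is ignored here."""
    segment = []
    for transaction in transactions:
        segment.append(transaction)
        if len(segment) == segment_size:
            found = frequent_in_segment(segment, min_count)
            for itemset, n in found.items():
                sketch.add(itemset_key(itemset), n)
            segment = []


if __name__ == "__main__":
    stream = [["a", "b"], ["a", "b", "c"], ["a"], ["b", "c"]] * 2
    cms = CountMinSketch()
    process_stream(stream, segment_size=4, min_count=2, sketch=cms)
    print(cms.estimate(itemset_key(["a", "b"])))
```

The sketch keeps memory fixed at `width * depth` counters no matter how many distinct itemsets the stream produces, at the price of possible overestimates when keys collide. Itemsets that fall below `min_count` within a segment are dropped before counting, so the totals across segments are approximate in that direction as well.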
