Web Usage Mining and the Research of Personalized Recommendation

Author WangYong
Advisor LiuJianPing
School Zhejiang University of Technology
Course Applied Computer Technology
Keywords Data mining; Web usage mining; Personalized recommendation; Apriori algorithm; K-means algorithm
CLC TP311.13
Type Master's thesis
Year 2011

Data mining is an important research direction in computer science, artificial intelligence, and databases. It extracts implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random real-world data. Web pages contain complex, unstructured, and dynamic data; analyzing the vast amount of information on the Web and providing personalized recommendation services tailored to users' needs is an important application of data mining today. Building on the results of previous studies, this thesis investigates Web usage mining. The main contents are as follows:
(1) The basic theory and classification of data mining are surveyed, and the data sources for Web usage mining and the basic process of data preprocessing are analyzed in detail.
(2) The theory of association rules is described in detail, the performance of the classical Apriori algorithm is analyzed, and the algorithm is improved. A pruning step is performed before the join step that generates candidate sets, which reduces the number of itemsets taking part in the join and therefore the size of the generated candidate itemsets, the number of loop iterations, and the running time. Redundant comparisons in the join step are also reduced.
(3) The basic idea and procedure of the K-means clustering algorithm are described in detail and its strengths and weaknesses are analyzed, and a modified K-means algorithm, the MFA algorithm, is proposed. Because K-means needs a large number of distance calculations to determine each point's new cluster center after every adjustment of the centers, the proposed approach uses information about how the cluster centers have changed to determine the new cluster center, and reduces the amount of computation by filtering against a dynamically selected candidate set of centers.
(4) Log data from a campus Web site are analyzed and processed, the improved mining algorithms are applied to discover users' access patterns, and the mining results are used to add a personalized recommendation feature to the site that proactively recommends information likely to interest each user.
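
For readers unfamiliar with the join and prune steps that item (2) refers to, the following is a minimal sketch of the classical Apriori algorithm applied to page-visit sessions. It is not the improved variant proposed in the thesis, whose details the abstract does not give. The function name apriori, the parameter min_support (an absolute count), and the sample sessions are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Classical Apriori (illustrative sketch, not the thesis's improved version).
    min_support is an absolute count of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count the support of the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical sessions: each set holds the pages visited in one user session.
sessions = [
    {"/home", "/news", "/library"},
    {"/home", "/news"},
    {"/home", "/library"},
    {"/news", "/library"},
    {"/home", "/news", "/library"},
]
result = apriori(sessions, min_support=3)
```

With these sample sessions and a minimum support of 3, every single page and every pair of pages is frequent, while the three-page itemset appears in only two sessions and is discarded. Frequent pairs of this kind are the raw material from which association rules for page recommendation can be derived.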