Research and Implementation of User Behavior Analysis Based on Web Log Mining

Author LingXiaoQin
Advisor SongBin
School Nanjing University of Technology and Engineering
Course Computer Applications
Keywords analysis of users' behaviors; ant colony algorithm; Web logs; maximizing deviation weighting algorithm
CLC TP311.13
Type Master's thesis
Year 2011

With the global popularity of the Web, the amount of information on it has proliferated, and different groups of users extract distinctly different information from Web pages. By mining Web logs and clustering users' behaviors effectively, it becomes possible to optimize websites, provide users with personalized services, support network marketing, improve network security, and so forth. Mining user behavior from Web logs has therefore become a hot research topic.

This paper proposes a Web-log-based method for mining user behavior. The Web logs are preprocessed, and then users and transactions are identified, among other steps, to obtain a list of each user's transactions. From this list, characteristic values are extracted, including page access time, the number of pages of interest, and the number of downloads. An improved ant heap clustering algorithm is then used to perform clustering analysis of the users' behaviors.

For the ant heap clustering algorithm, I try to improve the traditional LF algorithm, mainly in how the similarity of users' behaviors is calculated. Traditional characteristic weighting, which typically relies on averaging, does not suit the clustering analysis of user behavior in this paper. Instead, the maximizing deviation weighting method is used to improve the weight assigned to the characteristic value in each dimension. Compared with the traditional ant heap clustering algorithm, the proposed method is better suited to clustering high-dimensional characteristic values and reflects the patterns of users' behaviors more accurately, which helps network administrators optimize the network to meet users' application demands.
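The abstract does not give the exact formulation used for maximizing deviation weighting, so the following is only a minimal sketch of one common form of the idea: each characteristic is weighted in proportion to the total pairwise deviation of its normalized values across users, so that characteristics that separate users more strongly count more in the similarity calculation. The function name `max_deviation_weights`, the min-max normalization step, and the sample values are assumptions made for illustration and are not taken from the thesis.

```python
def max_deviation_weights(rows):
    """Return one weight per characteristic (column), summing to 1.

    rows: list of per-user characteristic vectors, e.g.
          [access_time, pages_of_interest, downloads] (hypothetical layout).
    Assumption: columns are min-max normalized before deviations are summed.
    """
    n_features = len(rows[0])
    deviations = []
    for j in range(n_features):
        col = [float(r[j]) for r in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no discriminating information.
        norm = [(v - lo) / span if span > 0 else 0.0 for v in col]
        # Total absolute deviation between every pair of users on this column.
        deviations.append(sum(abs(a - b) for a in norm for b in norm))
    total = sum(deviations)
    if total == 0:
        # All users identical: fall back to equal weights.
        return [1.0 / n_features] * n_features
    return [d / total for d in deviations]


# Hypothetical users: the middle characteristic is identical for everyone.
users = [[10, 2, 0], [200, 2, 5], [30, 2, 1]]
weights = max_deviation_weights(users)
```

In this sketch, a characteristic on which all users agree receives zero weight, whereas equal (averaged) weighting would let it dilute the similarity measure. The resulting weights could then be plugged into a weighted distance between user vectors inside an ant-based clustering loop.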
