Study and Application of Clickstream Data Warehouse in E-commerce |
|
Author | XuZuo |
Tutor | XieWenGe |
School | Liaoning University of Technology |
Course | Applied Computer Technology |
Keywords | Data Warehouse; web log; click stream data; implicit association page; user clustering |
CLC | TP311.13 |
Type | Master's thesis |
Year | 2014 |
Downloads | 22 |
Quotes | 0 |
With the development of database technology, enterprise productivity has been substantially enhanced. The worldwide use of databases has led to dramatic growth in stored data, and enterprises now hold large amounts of it. However, this data cannot be converted into information effectively, so we face a "data-rich and information-poor" situation. As a result, an enterprise's investment in databases does not translate into earnings. A data warehouse can store large amounts of historical data, and its appearance offers a good solution to this problem. Traditional data warehouses load data only from various types of business systems. With the development of the Internet, web data is becoming increasingly important, and Web logs are a particularly important part of it. They can help decision makers understand user habits and then make targeted deployments. In this context, this thesis builds a click stream data warehouse, implements a user clustering algorithm based on implicitly associated pages, and describes how to apply the user clustering algorithm in e-commerce.
The click stream data warehouse is built against the background of e-commerce applications, with Web logs as an important data source. The design follows the architecture promoted by Inmon: a data warehouse plus dependent data marts. The data warehouse uses a relational model, and the data marts are built with a dimensional model. The data warehouse, which is stored in third normal form, holds a large volume of historical, fine-grained business data. As the data foundation, it helps business managers make decisions. The structure of the dependent data marts is based on user requirements. The data warehouse plus dependent data mart schema strikes a good balance between access efficiency and flexibility of structural adjustment. Based on the click stream data warehouse, this thesis presents a vector-based click stream user clustering algorithm. The algorithm maps each user's click stream data to a vector.
The similarity between users' data is determined by the size of the angle between their vectors. The associated pages and page groups found by the implicit association mining algorithm serve as the dimensions of the vectors. Implicitly associated pages reflect users' access habits well and better highlight their topics of interest.
The algorithms in this thesis were tested in an experimental data warehouse. The experiments show that the algorithm can effectively identify the user's target page and the implicitly associated pages. It can find more than 2 associated pages. The resulting user clustering is better suited to the complexity of the Internet environment.
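The vector-based similarity described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the page paths, the `to_vector` and `cluster_users` helpers, the greedy grouping strategy, and the 0.8 threshold are all assumptions made for illustration. The abstract only states that click streams are mapped to vectors and compared by the angle between them. For simplicity the dimensions here are single pages, whereas the thesis also uses page groups.

```python
import math
from collections import Counter


def to_vector(clicks, dimensions):
    """Map a user's click stream to a count vector over the chosen dimensions."""
    counts = Counter(clicks)
    return [counts[d] for d in dimensions]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_users(vectors, threshold=0.8):
    """Greedy grouping (an assumed strategy): a user joins the first cluster
    whose first member is at least `threshold` similar, else starts a new one."""
    clusters = []
    for user, vec in vectors.items():
        for cluster in clusters:
            representative = vectors[cluster[0]]
            if cosine_similarity(vec, representative) >= threshold:
                cluster.append(user)
                break
        else:
            clusters.append([user])
    return clusters


# Hypothetical dimensions: pages assumed to come from association mining.
dimensions = ["/phones", "/phones/cases", "/books", "/books/fiction"]

# Hypothetical click streams per user.
sessions = {
    "u1": ["/phones", "/phones/cases", "/phones"],
    "u2": ["/phones", "/phones/cases"],
    "u3": ["/books", "/books/fiction", "/books/fiction"],
}

vectors = {user: to_vector(clicks, dimensions) for user, clicks in sessions.items()}
clusters = cluster_users(vectors)
print(clusters)
```

A smaller angle gives a cosine closer to 1, so users who visit the same associated pages in similar proportions land in the same group regardless of how many clicks each made in total.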