
Research and Application of a Chinese Vertical Search Engine Based on the PageRank Algorithm

Author YangChen
Tutor YanBo; SunYong
School University of Electronic Science and Technology
Course Software Engineering
Keywords Topic Search; PageRank; Sorting Algorithm; Elimination of Web Pages; User Feedback
CLC TP391.3
Type Master's thesis
Year 2012

With the advent of the information age, more and more information is available on the Internet. Finding specific information there is difficult, so search engines have become the tool people use to locate it. However, a general search engine often returns millions of results for a query, which makes it increasingly hard for users to find the information they want. Vertical search engines are a tool for solving this problem.

This paper first studies the architecture and key technologies of vertical search engines, then examines data preprocessing and sorting algorithms for vertical search engines in detail. Page preprocessing and the removal of similar pages are the two data processing techniques considered. The paper describes the page preprocessing model, analyzes its problems, and proposes an improved model that adds an impact factor and synonym expansion of search words. It then introduces existing algorithms for removing duplicate pages, analyzes their shortcomings, and presents a duplicate-removal algorithm for a Chinese agricultural vertical search engine that combines a segmentation-based MD5 signature with a single MD5 digital signature.

Search engine users are often dissatisfied with how search results are sorted. To address this issue, the paper first presents classic sorting algorithms and analyzes their advantages and disadvantages.
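As an illustration of the MD5 signature approach to duplicate removal summarized above, the following is a minimal Python sketch, not the thesis's actual algorithm. It assumes that "segmentation" means splitting page text at sentence punctuation, and that two pages count as duplicates when their whole-text MD5 digests match or when most of their segment digests overlap. The function names, the punctuation set, and the 0.8 threshold are all hypothetical.

```python
import hashlib
import re


def md5_hex(text):
    """Return the MD5 digest of a string as hex (UTF-8 encoded)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def segment_signatures(text):
    """MD5 digest of each non-empty segment of the text.

    Assumption: segments are delimited by Chinese or ASCII sentence
    punctuation; the thesis may segment pages differently.
    """
    parts = re.split(r"[。！？；.!?;\n]+", text)
    return {md5_hex(p.strip()) for p in parts if p.strip()}


def is_duplicate(a, b, threshold=0.8):
    """Exact match on the single MD5, else overlap of segment MD5s."""
    if md5_hex(a) == md5_hex(b):
        return True
    sig_a, sig_b = segment_signatures(a), segment_signatures(b)
    if not sig_a or not sig_b:
        return False
    # Share of the smaller page's segments that also occur in the other page.
    overlap = len(sig_a & sig_b) / min(len(sig_a), len(sig_b))
    return overlap >= threshold  # hypothetical threshold


def deduplicate(pages, threshold=0.8):
    """Keep the first page of every group of duplicates, in input order."""
    kept = []
    for page in pages:
        if not any(is_duplicate(page, k, threshold) for k in kept):
            kept.append(page)
    return kept
```

Comparing every page against every kept page is quadratic; a real crawler would more likely index the digests in a hash table, but the pairwise form keeps the idea visible.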
To overcome the shortcomings of the classic sorting algorithms, the paper proposes a sorting algorithm based on PageRank. The algorithm first improves PageRank by adding a database factor and a time feedback factor, then incorporates a Lucene score sorting factor, a user feedback factor, and a site level sorting factor. The resulting algorithm can meet the needs of an agricultural vertical search engine.

Finally, a Chinese agricultural vertical search engine was developed. The results indicate that the algorithm can meet the demands of a Chinese topic search engine in recall rate, accuracy, and response time.
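The abstract does not give the formula that combines these factors, so the following Python sketch only illustrates the general idea under stated assumptions. `pagerank` is the standard power-iteration form of PageRank, without the thesis's database and time feedback factors. `combined_score` is a hypothetical weighted sum of a page's PageRank, Lucene score, user feedback, and site level, each assumed to be normalized to the range 0 to 1; the weights are placeholders, not values from the thesis.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Standard PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Rank held by pages without out-links is spread evenly over all pages.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def combined_score(page_rank, lucene_score, user_feedback, site_level,
                   weights=(0.4, 0.3, 0.2, 0.1)):
    """Hypothetical weighted sum of ranking factors (all assumed in [0, 1])."""
    w_pr, w_lucene, w_feedback, w_site = weights
    return (w_pr * page_rank + w_lucene * lucene_score
            + w_feedback * user_feedback + w_site * site_level)


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    ranks = pagerank(graph)
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(page, round(ranks[page], 4))
```

A linear combination is only one option; it is used here because it makes the contribution of each factor easy to inspect and tune.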
