
Research and Implementation of Vertical Search Engine

Author GouZhiZuo
Tutor GaoKai; ZhangWuHong
School Hebei University of Science and Technology
Course Computer technology
Keywords vertical search engine system; Web spider crawling; extraction of structured information; index creation; sorting mechanism
CLC TP391.3
Type Master's thesis
Year 2012

With the rapid development of Internet technology and computer technology and the popularization of personal computers, the channels through which people obtain information are steadily broadening. Among these sources, the influence of the Internet keeps growing, and retrieving information online has become one of the main ways people access information. As the volume of Web information grows geometrically, the service provided by traditional search engines no longer meets users' needs. More and more users demand search engines that are more intelligent and user-friendly, and expect search results that are more accurate and better matched to their own needs. These new demands place higher requirements on search engine technology, and vertical search engine technology emerged against this background.

This thesis studies the application of vertical search engine technology. It first analyzes the characteristics and working principle of vertical search engines. It then examines the internal structure and operating mechanism of Heritrix, a Web crawler (spider) tool, and uses Heritrix to crawl Web content. Structured information is extracted from the crawled pages, and the relevant Web content is stored. The thesis also studies the internal structure and operating mechanism of Lucene and uses Lucene to build the index. It studies Lucene's sorting mechanism and optimizes the ordering of results. Finally, it completes a full vertical search engine system and analyzes the results. The work offers a practical reference for designing vertical search engine systems for corporate websites.
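The indexing and sorting steps named in the abstract can be illustrated with a toy sketch. It is neither Lucene's API nor the thesis implementation: the class name TinyIndex, the sample pages, and the plain TF-IDF weighting are all assumptions chosen for illustration, and Lucene's actual scoring is more elaborate.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical toy inverted index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, text):
        """Index one crawled page under the given id."""
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = tf
        self.doc_count += 1

    def search(self, query):
        """Return matching doc ids, highest TF-IDF score first."""
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue  # term does not occur in any indexed page
            # Rarer terms get a larger weight (assumed smoothing: log(1 + N/df)).
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        # Sort by descending score; break ties by doc id for determinism.
        return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up pages standing in for content crawled from a corporate website.
index = TinyIndex()
index.add("p1", "steel pipe price list")
index.add("p2", "steel pipe steel fittings")
index.add("p3", "company news")

results = index.search("steel pipe")
```

Here the page that mentions "steel" twice ranks ahead of the page that mentions it once, and the unrelated page is not returned at all. Changing the ordering of results, as the thesis describes doing for Lucene, would amount to changing how the score is computed in `search`.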
