
Study on the Evaluation of Performance of Search Engines’ Features

Author FeiWei
Advisor PengZuoZhang; ZhangJin
School Wuhan University
Discipline Library science
Keywords Search engine; Information retrieval; Evaluation; Relevance; Ranking; Optimization
CLC TP391.3
Type PhD thesis
Year 2010
Downloads 1823
Quotes 4

Search engine evaluation is an active research area in information retrieval, and advances in web information and retrieval technology have in turn driven the practical development of search engines. To meet users' growing information needs, search engines have not only improved their basic search function but also continued to develop advanced search functions. These functions are designed to help users reach high-quality web information, yet their retrieval performance is not well known. This thesis takes the relevance of search results and the quality of their ranking as two core evaluation criteria and uses them to assess the main search functions of the current major search engines. The results can help users choose an appropriate search strategy when retrieving information, and they also show how the different search functions perform on each search engine.
Chapter 1 reviews the state of search engines and of search engine evaluation in recent years. Drawing on an extensive literature review, it summarizes the content, methods, characteristics, shortcomings, and trends of existing research. Current evaluation studies take the major search engines as their core subject; they rely mainly on experiments, surveys, data analysis, observation, and reviews and commentary; and they are characterized by dependence, dynamism, diversity, and attention to user participation. Existing evaluations nevertheless have shortcomings, chiefly the lack of comparisons of retrieval efficiency across different search functions and the lack of evaluations of ranking quality. As multimedia information grows, evaluating the multimedia retrieval functions of search engines is expected to become a focus of future research.
Chapter 2 argues that relevance is the fundamental indicator for search engine evaluation and that ranking quality is an indicator derived from it. Relevance is judged from the form and content of a web page; ranking quality is determined by the ordering of the results and the stability of that ordering. Around these two core indicators, the author builds an evaluation framework and, following set criteria, selects five Chinese and English search engines and five search functions as the objects of study. The English search engines are Google, Yahoo, and MSN/Live/Bing; the Chinese search engines are Baidu and Google. The five search functions are title search, phrase search, PDF search, URL search, and general search, with general search serving as the baseline for comparison.
Chapter 3 states the research hypotheses and designs the experimental procedure. The analytic hierarchy process (AHP) is applied to the relevance evaluation: from a series of candidate indicators, core indicators such as the full text of the result, the abstract, the title, page validity, user burden, and page length are selected to measure relevance. The method for calculating the relevance of search results is revised, and the corrected relevance formula is used to measure the overall relevance of the results returned by each search function. Analysis of variance (ANOVA) is used to test whether retrieval efficiency differs significantly among the search functions of each search engine. Where a significant difference exists, Tukey's multiple comparison test is used to identify its source. Regression analysis is used to evaluate the order and stability of the ranking of search results.
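The variance analysis described for Chapter 3 can be illustrated with a minimal sketch of a one-way ANOVA F statistic. The scores below are hypothetical values invented for illustration, not data from the thesis, and the function name one_way_anova_f is likewise an assumption. Tukey's follow-up test is not shown because it needs the studentized range distribution, which is outside the Python standard library.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per search function.
    Assumes at least two groups and some within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical relevance scores per search function (illustration only).
scores = {
    "general": [1, 2, 3],
    "title": [2, 3, 4],
    "pdf": [5, 6, 7],
}
f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
# f_stat is approximately 13 with (2, 6) degrees of freedom.
```

A large F relative to the critical value for the given degrees of freedom suggests that at least one search function differs from the others, which is the point at which a multiple comparison test such as Tukey's would be applied.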
Chapter 4, based on 50,000 data records, applies analysis of variance to evaluate the five search functions of the five search engines. The results show significant differences in retrieval efficiency among the search functions, and Tukey's multiple comparison test identifies the sources of these differences. Among the search functions, PDF search has the highest retrieval efficiency, followed by title search, general search, phrase search, and URL search. In the stability evaluation, general search is more stable than the other search functions. Among the English search engines, Yahoo has higher retrieval efficiency than Google and MSN/Live/Bing across the five search functions, Google comes second, and MSN/Live/Bing has the lowest. Among the Chinese search engines, Google's title search, general search, PDF search, and URL search are much more efficient than Baidu's, while phrase search shows no significant difference between the two.
Chapter 5 uses regression analysis with curve estimation to compare the ranking quality of the five search functions on the five search engines. On the English search engines, general search has the best ranking quality and URL search the worst. On the Chinese search engines, URL search has the poorest ranking quality; Baidu's ranking quality is best for PDF search, and Google's is best for title search. The data show that the ranking quality of the Chinese search engines lags considerably behind that of the English search engines.
Chapter 6 notes that, during data collection and analysis, a large gap was found between the Chinese and English search engines in both retrieval efficiency and the ranking of search results.
To address the problems of the Chinese search engines, the author proposes corresponding optimization strategies: improving the quality of Chinese website construction and promoting open access, so as to raise the quality of Chinese web resources at the source. Search engines should also have strong information filtering capabilities and should be cautious about commercial practices that manually interfere with the ranking of search results.
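The regression-based ranking evaluation summarized for Chapter 5 can be sketched as a curve fit of relevance against rank position. The abstract does not say which curve families were estimated, so the logarithmic model score = a + b * ln(rank) used here is an assumption, as are the function name fit_log_curve and the synthetic data.

```python
import math


def fit_log_curve(ranks, scores):
    """Least-squares fit of score = a + b * ln(rank).

    Returns (a, b, r_squared). Assumes at least two distinct ranks
    and scores that are not all identical.
    """
    xs = [math.log(r) for r in ranks]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(scores) / n

    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = y_bar - b * x_bar

    # R^2: share of the variation in scores explained by the fitted curve.
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, scores))
    ss_tot = sum((y - y_bar) ** 2 for y in scores)
    return a, b, 1 - ss_res / ss_tot


# Synthetic, noise-free relevance scores for result positions 1 to 10
# (illustration only, not data from the thesis).
ranks = list(range(1, 11))
scores = [5.0 - 1.5 * math.log(r) for r in ranks]
a, b, r2 = fit_log_curve(ranks, scores)
```

Under this reading, a negative slope b indicates that relevance falls as rank position increases, which is what a well-ordered result list should show, and R^2 gives one rough measure of how consistently the ranking follows that trend.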
