
Implementation of a Source Code Search Engine Based on the Lucene 2.0 Architecture

Author LuoMei
Advisor ZhangYuan
School Northwestern Polytechnical University
Major Software Engineering
Keywords Search engine; Lucene; Chinese word segmentation; Thread; Liferay Portal
CLC TP391.3
Type Master's thesis
Year 2007
Downloads 692
Citations 3

With the Internet flourishing, the amount of information available online has become enormous. While people enjoy the convenience of the Internet, they also face the problem of finding the information they need, accurately and quickly, within this mass of data. Internet search engines emerged to address this problem, and Web search technology has become a focus of competing research and development in both the computer science community and the information industry. A search engine is a website dedicated to providing query services. Such a site collects a large number of pages from the Internet, either through crawling software or through site submission, processes them, and builds a database so that it can respond to user queries and return the information the user requires.

Based on the architecture of the open-source Lucene engine, this thesis designs and implements Hicode, a reusable and scalable search engine system. Hicode searches specifically for programming-language source code files, both on the Web and in local data, and locates the position of the code fragment a user needs together with the source file that contains it. The thesis first introduces Lucene, the open-source tool on which Hicode is built. It then uses Java to implement the three core parts of the search engine: the crawler, the indexer, and the searcher. The crawler relies on Java's multithreading mechanism and uses a thread pool to manage multiple crawl threads that fetch Web pages concurrently. The indexing and search parts are built on the Lucene architecture. To segment Chinese text more effectively, a custom Chinese word segmenter is implemented for Lucene, and serialization and JavaCC are introduced to improve indexing efficiency and development efficiency. Finally, the source code search engine is integrated into the Liferay portal, which provides the user interface.
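
The thread-pool crawling described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the class name `CrawlSketch`, the method `crawl`, and the `fetchLinks` parameter (a stand-in for "download a page and extract its links") are all hypothetical, and the sketch uses present-day `java.util.concurrent` features rather than whatever the 2007 implementation used. It only shows the general pattern of a fixed pool of worker threads sharing one set of visited URLs.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.function.Function;

// Hypothetical illustration of a thread-pool crawler; not taken from Hicode.
final class CrawlSketch {

    /**
     * Crawls outward from the seed URLs with a fixed-size thread pool.
     * fetchLinks is an assumed callback that returns the links found on a page.
     * Returns the set of every URL that was visited.
     */
    static Set<String> crawl(List<String> seeds,
                             Function<String, List<String>> fetchLinks,
                             int threads) {
        Set<String> visited = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // One party for the calling thread; each crawl task registers itself.
        // A Phaser allows at most 65535 parties at once, enough for a sketch.
        Phaser pending = new Phaser(1);
        for (String seed : seeds) {
            submit(seed, fetchLinks, visited, pool, pending);
        }
        // Blocks until every registered crawl task has finished.
        pending.arriveAndAwaitAdvance();
        pool.shutdown();
        return visited;
    }

    private static void submit(String url,
                               Function<String, List<String>> fetchLinks,
                               Set<String> visited,
                               ExecutorService pool,
                               Phaser pending) {
        // add() returns false if another task already claimed this URL.
        if (!visited.add(url)) {
            return;
        }
        pending.register();
        pool.execute(() -> {
            try {
                for (String link : fetchLinks.apply(url)) {
                    submit(link, fetchLinks, visited, pool, pending);
                }
            } finally {
                // Signal completion even if fetchLinks throws.
                pending.arriveAndDeregister();
            }
        });
    }
}
```

Calling `CrawlSketch.crawl(seeds, fetchLinks, 4)` with a `fetchLinks` function backed by a real HTTP client would crawl with four worker threads. Passing the fetch step in as a function keeps the concurrency logic separate from networking and HTML parsing, so the pool management can be exercised against an in-memory link graph.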
