Research and Implementation on Personalized Search Technique Based on Semantic Annotation
|Course||Computer System Architecture|
|Keywords||search engine; semantic web; semantic annotation; inverted index; word segmentation|
The World Wide Web has changed the way people exchange information, but most existing Web content is suitable only for manual processing. Some software tools improve human communication, yet they have shortcomings: results are ranked by term frequency rather than meaning; queries produce many matches but low accuracy, or few or no matches; results are highly sensitive to vocabulary; results do not correspond to the keyword; and results are confined to a single results page. Even when a search succeeds, users must browse the retrieved documents themselves to extract the information they need. The underlying cause is a lack of information that a computer can understand.

The Semantic Web, often called the next generation of the Web, can address these problems effectively. It is an extension of the WWW and can also use natural language to express Web content, so that Web content can be read and used by software agents. The Semantic Web is therefore regarded as a medium for exchanging data, information and knowledge. As a format for representing and sharing data, it supplies search engines with semantic data, so semantic technology can be used to implement semantic search, which returns results that better match the meaning of users' queries.

This thesis uses Semantic Web techniques to build a semantic search model based on semantic annotation. The model builds a domain ontology, annotates the relevant crawled pages, and then indexes the annotated resources, so that search results meet the user's semantic needs. The thesis covers the analysis, design and implementation of a semantic search engine: it crawls the text of Web pages, segments the text, and discusses the feasibility and effectiveness of building a domain ontology.
The texts are then annotated to produce metadata. The thesis analyzes the working mechanism, usage and implementation of this annotation, and establishes a semantic index on top of the existing inverted index, which makes the search engine usable. Experimental tests confirm the accuracy of this method and show that it largely satisfies users.
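The pipeline summarized above (segment the text, annotate it against a domain ontology, and extend an inverted index with the resulting annotations) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy `ONTOLOGY` mapping, the whitespace-based `segment` function standing in for a real word segmenter, the sample `DOCS`, and all function names are assumptions chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy ontology: surface term -> concept label (assumed for illustration).
ONTOLOGY = {
    "jaguar": "Animal",
    "leopard": "Animal",
    "sedan": "Vehicle",
}

PUNCTUATION = ".,;:!?"


def segment(text):
    """Whitespace tokenizer standing in for a real word segmenter."""
    tokens = (tok.strip(PUNCTUATION).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def annotate(tokens):
    """Return the set of ontology concepts mentioned by the tokens."""
    return {ONTOLOGY[tok] for tok in tokens if tok in ONTOLOGY}


def build_index(docs):
    """Build an inverted index keyed by both literal terms and concept tags."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = segment(text)
        for tok in tokens:
            index[("term", tok)].add(doc_id)
        for concept in annotate(tokens):
            index[("concept", concept)].add(doc_id)
    return index


def search(index, query):
    """Match documents by literal term, or by the concept a query term maps to."""
    hits = set()
    for tok in segment(query):
        hits |= index.get(("term", tok), set())
        concept = ONTOLOGY.get(tok)
        if concept is not None:
            hits |= index.get(("concept", concept), set())
    return sorted(hits)


# Hypothetical sample documents.
DOCS = {
    1: "The jaguar hunts at night.",
    2: "A leopard rests in the tree.",
    3: "The sedan is parked outside.",
}
INDEX = build_index(DOCS)
```

In this sketch, a query for "jaguar" retrieves document 1 through the literal term and document 2 through the shared `Animal` concept, which is the kind of match a purely frequency-based keyword index would miss. A real system would need ranking and disambiguation on top of this, which the sketch leaves out.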