
Research on a Fast Retrieval Algorithm for Chinese Expressions and Sentences Based on a Chinese Corpus

Author LiShaoZuo
Tutor Wang; ZhangYuHe
School Yanshan University
Course Software Engineering
Keywords corpus; full-text retrieval; indexing of single Chinese characters; post-controlled vocabulary; search words
CLC TP391.1
Type Master's thesis
Year 2013

With the rapid informatization of China's national economy, natural language understanding, a core area of artificial intelligence, has received extensive attention from all sectors of society. Driven by the many practical needs of text processing, using computers and Chinese corpora to share, retrieve, and use language resources has become a trend in linguistics. Guided by advanced retrieval technology research at home and abroad, and taking into account the current state of Chinese corpora, this thesis studies a method for retrieving corpus content that uses the literature as its access point.

First, the thesis introduces the current state of computer-based corpora and how they have developed along with computer technology. On this basis, it analyzes the full-text retrieval technology applied to Chinese corpora, summarizes its technical characteristics, and points out its existing problems.

Second, it proposes two improvements to address the defects of single-character indexing. On the one hand, a post-controlled vocabulary is added to the single-character retrieval system to inform searchers of synonyms and related words, thereby improving the recall rate. On the other hand, the search matching algorithm is improved with a fast retrieval algorithm for Chinese phrases. A T-L conversion algorithm converts the Chinese characters of a text document into a location index table. Using the information in this table, the target words and phrases are located quickly, and an L-T conversion algorithm returns the sentence or context in which they occur to the user.

Finally, the thesis briefly introduces a Windows-based system that implements the fast retrieval algorithm for Chinese expressions and sentences. The system comprises a single-character index generation module, a retrieval module, a search result generation module, and a search range expansion module. The database design and the basic process of system implementation are also given.
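The retrieval idea in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract gives no algorithmic detail, so reading T-L as "text to location" and L-T as "location to text" is an assumption, and the names `build_index`, `find_phrase`, `sentence_at`, `search`, and the `synonyms` mapping (standing in for the post-controlled vocabulary) are all hypothetical.

```python
from collections import defaultdict

# Assumed sentence delimiters; the thesis may segment sentences differently.
SENTENCE_END = "。！？!?"

def build_index(text):
    """T-L step (assumed): map each character to the list of its positions."""
    index = defaultdict(list)
    for pos, ch in enumerate(text):
        index[ch].append(pos)
    return index

def find_phrase(index, phrase):
    """Return start positions where the characters of `phrase` are consecutive."""
    if not phrase:
        return []
    # Position sets for the 2nd, 3rd, ... characters of the phrase.
    later = [set(index.get(ch, ())) for ch in phrase[1:]]
    # Candidate starts come from the first character's position list.
    return [start for start in index.get(phrase[0], [])
            if all(start + k + 1 in later[k] for k in range(len(later)))]

def sentence_at(text, pos):
    """L-T step (assumed): return the sentence that contains position `pos`."""
    begin = max(text.rfind(mark, 0, pos) for mark in SENTENCE_END) + 1
    ends = [e for e in (text.find(mark, pos) for mark in SENTENCE_END) if e != -1]
    end = min(ends) + 1 if ends else len(text)
    return text[begin:end].strip()

def search(text, index, query, synonyms=None):
    """Search for `query` plus any synonyms listed for it; return unique sentences."""
    terms = [query] + list((synonyms or {}).get(query, []))
    results = []
    for term in terms:
        for start in find_phrase(index, term):
            sentence = sentence_at(text, start)
            if sentence not in results:
                results.append(sentence)
    return results

text = "我喜欢计算机。计算机检索很快！电脑也叫计算机吗？"
index = build_index(text)
for sentence in search(text, index, "电脑", {"电脑": ["计算机"]}):
    print(sentence)
```

In this sketch, expanding the query "电脑" with the synonym "计算机" retrieves all three sentences, whereas the unexpanded query matches only the last one, which is the recall improvement the abstract attributes to the post-controlled vocabulary.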
