Design and Implementation of a WSD System Based on Chinese Real Text
|Course||Applied Computer Technology|
|Keywords||Natural Language Processing, Word Sense Disambiguation, Polysemy, Real Text, HowNet, Relationship|
In Natural Language Processing (NLP), word sense disambiguation (WSD) has long been both a focus and a difficulty of research, and it has important theoretical and practical significance for language information processing. Word sense disambiguation serves applications such as speech recognition and other natural language processing systems. The main work of this paper is to study how to acquire the knowledge that supports word sense disambiguation and, on that basis, to build a word sense disambiguation system for Chinese content words in real text. The work covers the following aspects:
1. Word sense disambiguation knowledge: acquiring disambiguation knowledge is a key problem of word sense disambiguation. Based on an analysis of the knowledge that disambiguation requires and of the available knowledge resources, this paper studies the automatic acquisition of that knowledge. An online knowledge system is used to support the disambiguation knowledge base.
2. Word sense disambiguation system design and implementation: using syntactic information, frequency information, role relationships between concepts, collocation information, semantic association among context words (clustering), and selectional restriction information, a word sense disambiguation model is proposed. 56,000 word tokens are extracted from a news text corpus, and disambiguation is attempted on the content words among them (nouns, verbs, adjectives). The model includes a part-of-speech filter, a local analyzer, and instance-library matching. On this basis, the system was designed and implemented.
3. Word sense disambiguation system evaluation: a news text corpus serves as the evaluation corpus, covering politics, sports, agriculture, and science and technology. The corpus texts are first processed by the Shanxi University word segmentation and POS tagging system and are then passed to our system as input.
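The three components named for the model (part-of-speech filter, local analyzer, instance-library matching) can be illustrated with a minimal Python sketch. It is not the system described here: the tag set, the tiny sense inventory `SENSE_CUES`, the frequency table `SENSE_FREQ`, the `INSTANCE_LIBRARY`, the window size, and the order in which the components are tried are all assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed POS tags for content words: noun, verb, adjective.
CONTENT_POS = {"n", "v", "a"}

@dataclass(frozen=True)
class Token:
    word: str
    pos: str

# Hypothetical sense inventory: word -> {sense: cue words expected nearby}.
SENSE_CUES = {"打": {"hit": {"架", "人"}, "play": {"球", "篮球"}, "call": set()}}
# Hypothetical sense frequencies, used as the last-resort fallback.
SENSE_FREQ = {"打": Counter({"hit": 5, "play": 3, "call": 2})}
# Hypothetical instance library: (target word, context word) -> sense.
INSTANCE_LIBRARY = {("打", "电话"): "call"}

def pos_filter(tokens):
    """Return indices of content words that have more than one listed sense."""
    return [i for i, t in enumerate(tokens)
            if t.pos in CONTENT_POS and len(SENSE_CUES.get(t.word, {})) > 1]

def local_analysis(tokens, i, window=2):
    """Pick the sense whose cue words overlap most with the local context."""
    target = tokens[i].word
    context = {t.word for t in tokens[max(0, i - window): i + window + 1]} - {target}
    scores = {s: len(cues & context) for s, cues in SENSE_CUES[target].items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def match_instance(tokens, i, window=2):
    """Look for a stored (target, context word) pair in the instance library."""
    target = tokens[i].word
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if j != i and (target, tokens[j].word) in INSTANCE_LIBRARY:
            return INSTANCE_LIBRARY[(target, tokens[j].word)]
    return None

def disambiguate(tokens):
    """Map each ambiguous content-word index to a chosen sense."""
    result = {}
    for i in pos_filter(tokens):
        sense = local_analysis(tokens, i) or match_instance(tokens, i)
        if sense is None:
            # Fall back to the most frequent sense of the word.
            sense = SENSE_FREQ[tokens[i].word].most_common(1)[0][0]
        result[i] = sense
    return result
```

For example, `disambiguate([Token("他", "r"), Token("打", "v"), Token("篮球", "n")])` returns `{1: 'play'}`, because the context word 篮球 matches a cue of the "play" sense. Input tokens are expected to be already segmented and POS-tagged, as in the evaluation setup above.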
The evaluation results show that the system is effective for disambiguating Chinese real text, with an accuracy rate that can reach 80%. The experimental results also show that the word sense disambiguation model established in this paper, which combines multiple kinds of linguistic knowledge obtained through knowledge acquisition, achieves good disambiguation results.