Coreference Resolution of Chinese Simple Noun Phrases in the Sports News Domain |
|
Author | ChenZuoYang |
Advisor | WangShuMei; HuangHeYan |
School | Nanjing University of Technology and Engineering |
Major | Computer Software and Theory |
Keywords | Coreference resolution; Statistical learning methods; Decision tree; Semantic class features; Reference features |
CLC | TP391.1 |
Type | Master's thesis |
Year | 2009 |
Downloads | 43 |
Citations | 0 |
Coreference is a common phenomenon in natural language discourse. It makes language more concise and sentences more coherent, but it also makes natural language harder for computers to understand. This thesis studies techniques for Chinese coreference resolution and, building on the decision tree algorithm, focuses on how various features can be used for coreference resolution. The main work is as follows:
1. Based on the characteristics of noun phrases in sports news, the simple noun phrase is defined, and part-of-speech rules are used to identify the objects that coreference resolution operates on. The annotation standards for the training and test corpora are unified, and an annotation tool is designed to produce coreference resolution training examples.
2. Clause-based and sentence-based distance features are analyzed and compared for coreference resolution. Experiments show that introducing the clause-based distance feature significantly improves precision with little change in recall; precision improves by 4.08%.
3. After a survey of a large body of domestic and international literature on semantic attribute features for coreference resolution, a semantic class feature computed from hyponymy relations is proposed. Three shallow semantic categories, including person names and organization names, are defined and used for a simple semantic division of noun phrases. Experiments comparing this feature with other features show that the hyponymy-based semantic class feature supports the coreference classification decision better than the other features (F-value: 8.54%).
4. The characteristics and categories of Chinese referring expressions are explored. For noun phrases, the rules that earlier work summarized for classifying Chinese referring expressions are adopted as rule-based Chinese reference features.
5. A sports-news-oriented Chinese coreference resolution experimental system is implemented on the basis of decision trees.
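To make the distance features compared in item 2 concrete, the following is a minimal sketch and not the thesis's implementation. It assumes that clause and sentence boundaries can be approximated by Chinese punctuation, and that a mention is identified by its character offset. The `Mention` type, the punctuation sets, and the function names are hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Assumption: sentences end at these marks; clauses also end at commas,
# semicolons and colons. The thesis may segment text differently.
SENTENCE_END = "。！？"
CLAUSE_END = SENTENCE_END + "，；："


@dataclass
class Mention:
    text: str
    start: int  # character offset of the mention in the document


def count_boundaries(doc: str, a: int, b: int, marks: str) -> int:
    """Count boundary marks between two character offsets."""
    lo, hi = sorted((a, b))
    return sum(1 for ch in doc[lo:hi] if ch in marks)


def distance_features(doc: str, antecedent: Mention, anaphor: Mention) -> dict:
    """Sentence-based and clause-based distance for one mention pair."""
    return {
        "sentence_distance": count_boundaries(
            doc, antecedent.start, anaphor.start, SENTENCE_END),
        "clause_distance": count_boundaries(
            doc, antecedent.start, anaphor.start, CLAUSE_END),
    }


doc = "姚明今天得到30分，他还抢下10个篮板。火箭队赢得了比赛。"
yao = Mention("姚明", doc.index("姚明"))
he = Mention("他", doc.index("他"))
team = Mention("火箭队", doc.index("火箭队"))

print(distance_features(doc, yao, he))
# {'sentence_distance': 0, 'clause_distance': 1}
print(distance_features(doc, yao, team))
# {'sentence_distance': 1, 'clause_distance': 2}
```

In this toy example the two mentions in the first pair share a sentence, so the sentence-based distance cannot separate them from a same-clause pair, while the clause-based distance can. Feature dictionaries of this kind could then be passed to a decision tree learner together with the semantic class and reference features.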