Coreference Resolution Based on Anaphoric Mention Identification and Global Optimization Methods |
|
Author | QiShuHan |
Advisor | WangXuan |
School | Harbin Institute of Technology |
Major | Computer Science and Technology |
Keywords | Coreference resolution; Anaphoric mention identification; Global optimization; Integer linear programming |
CLC | TP391.1 |
Type | Master's thesis |
Year | 2011 |
Downloads | 26 |
Citations | 0 |
This thesis studies how identifying the mentions to be resolved (anaphoric mentions) and applying global optimization methods can improve coreference resolution. Through result feedback and parameter adjustment, coreference resolution systems with different properties can be combined with an anaphoric mention classifier, and each combination yields different coreference resolution performance. Anaphoric mention identification is treated as a classification problem, with a maximum entropy model used for training and classification. Using the maximum entropy model and 70 features, the thesis builds an anaphoric mention classifier. For building this classifier, it proposes a parameter adjustment method: by setting two parameters, the corpus ratio and the probability threshold, the classifier that best identifies anaphoric mentions can be selected. Applied before coreference resolution, anaphoric mention identification acts as a filter that removes a large number of non-anaphoric mentions.

The baseline coreference resolution system is built from a maximum entropy model and 65 features. These features cover several kinds of linguistic information, including part of speech, syntax, semantics, and morphology. Before coreference resolution, the classifier is applied to the phrases that take part in resolution in order to identify and filter out non-anaphoric mentions. Because anaphoric mention classifiers with different performance produce different results when combined with the baseline system, the thesis takes a global view: it adjusts the classifier through the corpus ratio and the probability threshold so that the performance of the overall coreference resolution system is optimal. The thesis also studies a second global optimization approach, which uses integer linear programming (ILP) to optimize coreference resolution globally.
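The threshold-based filtering and parameter selection described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `filter_anaphoric` and `pick_threshold`, the toy probabilities, and the F1-based `toy_score` are all assumptions made for illustration. In the thesis the probabilities would come from the maximum entropy classifier, and the score would be the end-to-end coreference performance; the corpus ratio parameter is not modeled here.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def filter_anaphoric(mentions: Sequence[str],
                     prob: Dict[str, float],
                     threshold: float) -> List[str]:
    """Keep mentions whose anaphoricity probability reaches the threshold."""
    return [m for m in mentions if prob[m] >= threshold]


def pick_threshold(mentions: Sequence[str],
                   prob: Dict[str, float],
                   thresholds: Sequence[float],
                   score: Callable[[List[str]], float]) -> Tuple[float, float]:
    """Return (best_threshold, best_score) under a caller-supplied score."""
    best_t, best_s = thresholds[0], float("-inf")
    for t in thresholds:
        s = score(filter_anaphoric(mentions, prob, t))
        if s > best_s:
            best_t, best_s = t, s
    return best_t, best_s


# Toy data (hypothetical): classifier probabilities and gold anaphoric mentions.
MENTIONS = ["it", "the company", "a dog", "he"]
PROB = {"it": 0.9, "the company": 0.6, "a dog": 0.2, "he": 0.8}
GOLD = {"it", "the company", "he"}


def toy_score(kept: List[str]) -> float:
    """F1 of the kept mentions against GOLD; a stand-in for system-level F."""
    if not kept:
        return 0.0
    correct = len(set(kept) & GOLD)
    if correct == 0:
        return 0.0
    precision = correct / len(kept)
    recall = correct / len(GOLD)
    return 2 * precision * recall / (precision + recall)


best_threshold, best_score = pick_threshold(MENTIONS, PROB, [0.1, 0.5, 0.7], toy_score)
```

Selecting the threshold by a downstream score, rather than by the classifier's own accuracy, mirrors the global view taken in the abstract: the filter is judged by what it does to the whole system.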
Coreference resolution is treated as an optimization problem: linear programming is introduced to further optimize, at a global level, the output of the maximum entropy model. The thesis proposes a relaxation of the transitivity constraints, which serve as the feasible-region constraints when linear programming is applied. In the experiments, the baseline system, the coreference resolution system with the anaphoric mention classifier added, and the coreference resolution system with global ILP optimization are compared with the two classic coreference resolution systems of Soon and Ng. The systems are evaluated with the MUC, B3, CEAF, and BLANC metrics, among others, and the average F value over these metrics is used as the final measure of coreference resolution performance. The thesis also studies how different parameter settings for anaphoric mention identification affect the coreference resolution system as a whole. The experimental results show that the coreference resolution system using anaphoric mention identification outperforms the other compared systems in overall results. With anaphoric mention identification applied on top of the baseline system, the average F value of coreference resolution rises from 50.57% to 53.35%.
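The ILP view of coreference resolution mentioned above can be sketched on a toy example. The sketch assumes one binary variable per mention pair, a log-likelihood objective over pairwise probabilities, and full transitivity constraints; the relaxed form proposed in the thesis is not specified in the abstract, so it is not reproduced here. For self-containment the tiny program is solved by exhaustive enumeration rather than by an ILP solver, and the names `is_transitive` and `solve` and the probabilities are hypothetical.

```python
import math
from itertools import combinations, product
from typing import Dict, Tuple

Pair = Tuple[int, int]


def is_transitive(x: Dict[Pair, int], n: int) -> bool:
    """Check x_ij + x_jk - x_ik <= 1 (and its permutations) for every triple."""
    for i, j, k in combinations(range(n), 3):
        # The three constraints are violated exactly when two links are set
        # and the third is not.
        if x[(i, j)] + x[(j, k)] + x[(i, k)] == 2:
            return False
    return True


def solve(p: Dict[Pair, float], n: int) -> Dict[Pair, int]:
    """Maximize sum of log p_ij (link) or log(1 - p_ij) (no link) under transitivity.

    Probabilities must lie strictly between 0 and 1.
    """
    pairs = list(combinations(range(n), 2))
    best: Dict[Pair, int] = {}
    best_val = -math.inf
    for bits in product((0, 1), repeat=len(pairs)):
        x = dict(zip(pairs, bits))
        if not is_transitive(x, n):
            continue  # outside the feasible region
        val = sum(math.log(p[q]) if x[q] else math.log(1.0 - p[q]) for q in pairs)
        if val > best_val:
            best, best_val = x, val
    return best


# Hypothetical pairwise coreference probabilities for three mentions.
P = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.4}
links = solve(P, 3)
```

In this example, thresholding each pair independently at 0.5 would link mentions 0 and 1 and mentions 1 and 2 but not 0 and 2, which is inconsistent. The constrained optimum instead places all three mentions in one chain, which illustrates why a global step over the pairwise classifier output can change the result.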