Research on Hypergraph Partition for Coreference Resolution |
|
Author | ChenZuoPeng |
Tutor | WangYuYing |
School | Harbin Institute of Technology |
Course | Computer Science and Technology |
Keywords | Coreference resolution; hypergraph modeling; k-way partition; iterative 2-way partition; hyperedge weight learning |
CLC | TP391.1 |
Type | Master's thesis |
Year | 2012 |
Downloads | 38 |
Quotes | 0 |
Coreference resolution is the task of grouping mentions of entities into sets so that all mentions in one set refer to the same entity. It is an important subtask of information extraction and has important applications in various fields of natural language processing and information retrieval.

Most recent approaches to coreference resolution divide the task into two steps: 1) a classification step that determines whether a mention pair is coreferent, or outputs a confidence that it is; 2) a clustering step that groups mentions into entities based on the output of step 1. In this thesis, we use hypergraph partitioning to solve the coreference resolution problem, avoiding the division into two steps and instead resolving coreference from a global optimization perspective. We represent a document as a hypergraph, where the vertices denote mentions and the hyperedges denote relational features between mentions. From the perspective of global optimization, the hypergraph is divided into multiple independent subgraphs, each of which represents an entity chain.

This thesis focuses on three parts: hypergraph modeling for coreference resolution; hypergraph partitioning for coreference resolution; and hyperedge weight learning based on an unlabeled corpus.

Hypergraph modeling for coreference resolution mainly concerns the hyperedges and the hyperedge weights. We introduce two hyperedge concepts, negative connection and strong connection, to depict the links between entities. We also propose two hyperedge weight learning methods.

Hypergraph partitioning for coreference resolution converts coreference resolution into a hypergraph partitioning problem.
We solve the coreference resolution problem with both k-way hypergraph partitioning and iterative 2-way hypergraph partitioning. K-way hypergraph partitioning optimizes the hypergraph cut loss and determines the number of entity chains by fixing the number of subgraphs. Iterative 2-way partitioning optimizes the tolerance of the partition and gradually splits subgraphs to determine the entity chains. Experiments on the ACE05 Chinese corpus, with comparison against traditional methods, demonstrate the effectiveness of our work.

For hyperedge weight learning based on an unlabeled corpus, we convert the supervised method into an unsupervised one. We propose one method based on head word matching and one based on word association. Word association is an effective feature for coreference resolution: high word association indicates coreference. Limited by sparsity and computational complexity, however, word association has not previously been applied effectively to coreference resolution. In this thesis, we use it to train the hyperedge weights on an unlabeled corpus. The experimental results show that our method is comparable to unsupervised methods and is more portable.
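
To make the idea of mentions as vertices, weighted hyperedges, and repeated 2-way splitting concrete, here is a minimal Python sketch. It is not the thesis's algorithm: the abstract does not specify the partition objective or the stopping rule, so the normalized-cut score, the `threshold` parameter, the brute-force search over bipartitions, and the toy mentions and weights are all assumptions chosen for illustration. The brute-force search is only practical for a handful of mentions.

```python
from itertools import combinations

# Toy document (hypothetical): vertex ids stand for mentions.
MENTIONS = ["Obama", "the president", "he", "Microsoft", "the company"]

# Each hyperedge is (set of mention ids, weight). Weights are made up.
HYPEREDGES = [
    (frozenset({0, 1, 2}), 1.0),  # e.g. a shared "person" feature
    (frozenset({3, 4}), 1.0),     # e.g. a shared "organization" feature
    (frozenset({2, 3}), 0.1),     # a weak, probably spurious link
]


def restrict(hyperedges, subset):
    """Keep the part of each hyperedge inside `subset`; drop parts with < 2 vertices."""
    kept = []
    for vertices, weight in hyperedges:
        inside = vertices & subset
        if len(inside) >= 2:
            kept.append((inside, weight))
    return kept


def normalized_cut(edges, side_a, side_b):
    """Assumed objective: weight of cut hyperedges, normalized by each side's volume."""
    cut = sum(w for e, w in edges if e & side_a and e & side_b)
    if cut == 0:
        return 0.0
    # A cut hyperedge touches both sides, so both volumes are positive here.
    vol_a = sum(w * len(e & side_a) for e, w in edges)
    vol_b = sum(w * len(e & side_b) for e, w in edges)
    return cut / vol_a + cut / vol_b


def best_bipartition(edges, subset):
    """Brute-force search over all 2-way splits of `subset` (at least 2 vertices)."""
    first, *rest = sorted(subset)
    best = None
    for size in range(len(rest)):  # leaves at least one vertex for side B
        for extra in combinations(rest, size):
            side_a = frozenset((first, *extra))
            side_b = subset - side_a
            score = normalized_cut(edges, side_a, side_b)
            if best is None or score < best[0]:
                best = (score, side_a, side_b)
    return best


def partition(hyperedges, subset, threshold):
    """Split `subset` in two while the best split scores below `threshold`."""
    if len(subset) < 2:
        return [sorted(subset)]
    edges = restrict(hyperedges, subset)
    score, side_a, side_b = best_bipartition(edges, subset)
    if score >= threshold:
        return [sorted(subset)]  # splitting would cut strong links: one entity chain
    return (partition(hyperedges, side_a, threshold)
            + partition(hyperedges, side_b, threshold))


chains = partition(HYPEREDGES, frozenset(range(len(MENTIONS))), threshold=0.5)
print(chains)  # [[0, 1, 2], [3, 4]]
```

In this toy run the first split removes only the weak 0.1 hyperedge, while any further split would cut a hyperedge of weight 1.0 and is rejected, so the number of entity chains falls out of the stopping rule instead of being fixed in advance as in a k-way partition.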