
Research on Hypergraph Partition for Coreference Resolution

Author ChenZuoPeng
Tutor WangYuYing
School Harbin Institute of Technology
Course Computer Science and Technology
Keywords Coreference Resolution; Hypergraph modeling; k-way partition; Iterative 2-way partition; Hyperedge weight learning
CLC TP391.1
Type Master's thesis
Year 2012

Coreference resolution is the task of grouping mentions of entities into sets so that all mentions in one set refer to the same entity. It is an important subtask of information extraction and has important applications in various fields of natural language processing and information retrieval.

Most recent approaches to coreference resolution divide the task into two steps: 1) a classification step that determines whether a pair of mentions is coreferent, or estimates the confidence that it is; and 2) a clustering step that groups mentions into entities based on the output of step 1. In this thesis, we solve coreference resolution with hypergraph partitioning, which avoids the division into two steps and instead treats the problem from a global optimization perspective. We represent a document as a hypergraph, where the vertices denote mentions and the hyperedges denote relational features between mentions. From the perspective of global optimization, the hypergraph is divided into multiple independent subgraphs, each of which represents an entity chain.

This thesis focuses on three parts: hypergraph modeling for coreference resolution; hypergraph partitioning for coreference resolution; and hyperedge weight learning based on an unlabeled corpus.

Hypergraph modeling for coreference resolution mainly concerns the hyperedges and the hyperedge weights. We introduce two hyperedge concepts, negative connection and strong connection, to describe the links between entities. We propose two hyperedge weight learning methods.

Hypergraph partitioning for coreference resolution converts coreference resolution into a hypergraph partitioning problem.
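
As a rough illustration of the modeling and of the reduction to partitioning, the sketch below stores mentions as vertices and relational features as weighted hyperedges, then scores a candidate partition by the total weight of the hyperedges it cuts. The mention strings, feature choices, weights, and the names `Hypergraph`, `cut_cost`, and `build_example` are all assumptions made for this example; they are not taken from the thesis, and the negative and strong connections it introduces are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """Mentions are vertices; each hyperedge links a set of mentions with a weight."""
    vertices: list
    hyperedges: list = field(default_factory=list)  # list of (frozenset, float)

    def add_hyperedge(self, members, weight):
        self.hyperedges.append((frozenset(members), float(weight)))


def cut_cost(graph, partition):
    """Sum the weights of hyperedges whose mentions fall into more than one part."""
    part_of = {v: i for i, part in enumerate(partition) for v in part}
    total = 0.0
    for members, weight in graph.hyperedges:
        if len({part_of[v] for v in members}) > 1:
            total += weight
    return total


def build_example():
    """Hypothetical five-mention document with three illustrative hyperedges."""
    g = Hypergraph(["Obama", "the president", "he", "Microsoft", "the company"])
    g.add_hyperedge({"Obama", "the president", "he"}, 2.0)   # assumed: person-type feature
    g.add_hyperedge({"Microsoft", "the company"}, 1.5)       # assumed: organization-type feature
    g.add_hyperedge({"he", "the company"}, 0.5)              # assumed: weak proximity feature
    return g


graph = build_example()
good = [{"Obama", "the president", "he"}, {"Microsoft", "the company"}]
bad = [{"Obama", "the president"}, {"he", "Microsoft", "the company"}]
print(cut_cost(graph, good), cut_cost(graph, bad))
```

Under these assumed weights, the partition that keeps each entity chain together cuts only the weak hyperedge, so it has the lower cost.
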
We solve the resulting problem with both k-way hypergraph partitioning and iterative 2-way hypergraph partitioning. The k-way partition optimizes the hypergraph cut loss and determines the number of entity chains by fixing the number of subgraphs. The iterative 2-way partition optimizes the tolerance of the partition and gradually splits subgraphs to determine the entity chains. Experiments on the ACE05 Chinese corpus, with comparison against traditional methods, demonstrate the effectiveness of our work.

For hyperedge weight learning based on an unlabeled corpus, we convert the supervised method into an unsupervised one. We propose one method based on head word matching and one based on word association. Word association is an effective feature for coreference resolution, since high word association indicates coreference. Limited by sparsity and computational complexity, however, word association has not been applied effectively to coreference resolution. In this thesis, we use it to train the hyperedge weights on an unlabeled corpus. The experimental results show that our method is comparable to unsupervised methods and has better portability.
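
The idea of gradually splitting subgraphs can be sketched as follows, under simplifying assumptions. Each subgraph is bisected by brute force, and a split is accepted only while the cut weight stays at or below a threshold, so the number of entity chains is not fixed in advance. The stopping rule, the `tolerance` parameter, the example mentions and weights, and the function names are assumptions for this example, not the procedure used in the thesis. The exhaustive search is exponential in the number of mentions and is only suitable for toy inputs.

```python
from itertools import combinations


def bisection_cost(left, right, hyperedges):
    """Total weight of hyperedges with at least one mention on each side."""
    return sum(w for members, w in hyperedges if members & left and members & right)


def best_bisection(vertices, hyperedges):
    """Exhaustively find the cheapest split of `vertices` into two non-empty sides."""
    first, rest = vertices[0], vertices[1:]
    best = None
    for size in range(len(rest)):  # size < len(rest) keeps the right side non-empty
        for extra in combinations(rest, size):
            left = frozenset((first,) + extra)
            right = frozenset(vertices) - left
            cost = bisection_cost(left, right, hyperedges)
            if best is None or cost < best[0]:
                best = (cost, left, right)
    return best


def iterative_two_way(vertices, hyperedges, tolerance):
    """Keep bisecting subgraphs while the cheapest cut costs at most `tolerance`."""
    chains, pending = [], [list(vertices)]
    while pending:
        current = pending.pop()
        if len(current) < 2:
            chains.append(frozenset(current))
            continue
        cost, left, right = best_bisection(current, hyperedges)
        if cost <= tolerance:
            pending.append([v for v in current if v in left])
            pending.append([v for v in current if v in right])
        else:
            chains.append(frozenset(current))  # too costly to split: one entity chain
    return chains


# Hypothetical mentions and hyperedge weights, for illustration only.
MENTIONS = ["Obama", "the president", "he", "Microsoft", "the company"]
HYPEREDGES = [
    (frozenset({"Obama", "the president", "he"}), 2.0),
    (frozenset({"Microsoft", "the company"}), 1.5),
    (frozenset({"he", "the company"}), 0.5),
]

chains = iterative_two_way(MENTIONS, HYPEREDGES, tolerance=1.0)
print(sorted(sorted(chain) for chain in chains))
```

With these assumed weights and a tolerance of 1.0, the first bisection cuts only the weak hyperedge, and both resulting subgraphs are then too costly to split further.
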
