
Research on Chinese Coreference Resolution and Its Related Technologies

Author WangZhiQiang
Tutor ZhongYiXin
School Beijing University of Posts and Telecommunications
Course Signal and Information Processing
Keywords Coreference resolution; Chinese base noun phrases; Conditional Random Fields; Maximum entropy; Twin-candidate model
CLC TP391.1
Type PhD thesis
Year 2006

Coreference is pervasive in natural-language discourse and dialogue. For human readers it makes sentences more concise and topics more prominent; for computers, however, it makes language harder to process. Coreference resolution has therefore emerged as a subfield of natural language processing. With the growing demand for discourse processing, it plays an increasingly important role in fields such as information extraction, machine translation, automatic summarization, and question answering.

This dissertation studies Chinese coreference resolution and Chinese base noun phrase (BaseNP) recognition. The main contributions are as follows:

1. We propose a rule-based Chinese BaseNP recognition algorithm that extends the part-of-speech (POS) template algorithm. Based on statistics and analysis of the words in the context of BaseNPs, the algorithm combines POS templates with context information to construct extended POS templates, which are used to correct the tagging results. Because these templates are well founded and accurate, precision reaches 94.48%.

2. We propose an algorithm for Chinese BaseNP recognition that combines statistical and rule-based methods. Motivated by the complementary strengths of the two approaches, the algorithm first assigns labels statistically and then corrects the results using extended POS templates. In experiments the F-measure reaches 89.51%, higher than that of any single method used alone, which shows that the combination lets the methods compensate, to some extent, for each other's weaknesses.

3. We propose machine learning methods for Chinese personal pronoun coreference resolution, based on methods developed for English and on the characteristics of the Chinese language. Nowadays, these methods for coreference resolution are
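
The two BaseNP contributions above describe correcting an initial labeling with POS templates that also look at the surrounding context. The following is a minimal sketch of that general idea, not the thesis's actual algorithm. The `ExtendedTemplate` type, the B/I/O labels, the POS tags (`v`, `a`, `n`, `u`), and the two example templates are all assumptions made for illustration. The initial labels are hard-coded where a statistical tagger's output would normally go.

```python
from typing import List, NamedTuple, Optional, Tuple


class ExtendedTemplate(NamedTuple):
    """Hypothetical extended POS template: a POS sequence plus optional context."""
    pos: Tuple[str, ...]          # POS tags that make up the candidate BaseNP
    left: Optional[str] = None    # required POS tag just before the span (None = any)
    right: Optional[str] = None   # required POS tag just after the span (None = any)


def find_spans(pos_tags: List[str],
               templates: List[ExtendedTemplate]) -> List[Tuple[int, int]]:
    """Scan left to right; return non-overlapping (start, end) template matches."""
    ordered = sorted(templates, key=lambda t: len(t.pos), reverse=True)  # longest first
    spans, i, n = [], 0, len(pos_tags)
    while i < n:
        for t in ordered:
            j = i + len(t.pos)
            if tuple(pos_tags[i:j]) != t.pos:
                continue
            left_ok = t.left is None or (i > 0 and pos_tags[i - 1] == t.left)
            right_ok = t.right is None or (j < n and pos_tags[j] == t.right)
            if left_ok and right_ok:
                spans.append((i, j))
                i = j
                break
        else:
            i += 1  # no template starts here; move on
    return spans


def correct_tags(initial_bio: List[str], pos_tags: List[str],
                 templates: List[ExtendedTemplate]) -> List[str]:
    """Overwrite the initial B/I/O labels wherever an extended template matches."""
    corrected = list(initial_bio)
    for start, end in find_spans(pos_tags, templates):
        corrected[start] = "B"
        for k in range(start + 1, end):
            corrected[k] = "I"
        # Keep the sequence well formed: a chunk may not continue past the span.
        if end < len(corrected) and corrected[end] == "I":
            corrected[end] = "B"
    return corrected


# Illustrative templates only (not taken from the thesis).
templates = [
    ExtendedTemplate(pos=("a", "n"), left="v"),  # adjective + noun right after a verb
    ExtendedTemplate(pos=("n", "n")),            # noun + noun in any context
]
pos_tags = ["v", "a", "n", "u", "n", "n"]
initial_bio = ["O", "O", "B", "O", "B", "O"]     # stand-in for a statistical tagger's output
corrected = correct_tags(initial_bio, pos_tags, templates)
print(corrected)
```

In this sketch the rule pass only overrides labels where a template fires and leaves the rest of the initial labeling untouched, which is one simple way to combine a statistical tagger with rule-based correction.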
