
Research on Structure Transition Technology for SMT

Author CuiLingYun
Tutor ZhaoTieJun
School Harbin Institute of Technology
Course Computer Science and Technology
Keywords machine translation; structure transition; parameter training; meta-structure; language model
CLC TP391.2
Type Master's thesis
Year 2008

In machine translation, the phrase-based approach is the most mature and stable method, but it has become difficult to improve further. A phrase-based model translates accurately when the phrase has appeared in the training corpus: it chooses the right words and reorders them correctly within the phrase. It handles unseen phrases and reordering between phrases poorly, however, because it uses neither syntactic information nor deeper semantic knowledge. For this reason, researchers hope to improve existing methods by introducing deeper linguistic structure. One of the most direct ideas is to introduce syntactic structure and build the SMT model on top of it.

This paper presents a novel structure transition model that maps a Recombination of MS (RM) on the source side to an RM on the target side. In this model, translation pairs of source RM and target RM, together with their word alignments, are learned automatically from a parsed and word-aligned corpus, and a translation probability is computed for each pair. To control reordering, we analyze the difference between a linear language model and a non-linear language model, and apply a non-linear language model over meta-structures.

Minimum error rate training is used to train the parameters of the log-linear model. Training optimizes BLEU, the same criterion used by the automatic evaluation system. The multi-dimensional optimization problem is solved as a sequence of one-dimensional problems, and we give an algorithm that finds the set of discrete candidate solutions for this continuous multi-dimensional problem. With this method, the training time of the model is reduced and the BLEU scores are improved.

Finally, we combine the structure transition model with the SMT system and use it to hide the syntactic divergence between the source and target languages. Long-distance reordering is thereby reduced to local reordering within an RM. Experiments show that the model significantly outperforms Pharaoh, a phrase-based system.
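
The one-dimensional search used in minimum error rate training can be illustrated with a minimal sketch. It is not the thesis's algorithm, only a simplified version of the standard line search under several assumptions: each n-best entry is a hypothetical `(feature_vector, error)` pair, a per-sentence error stands in for corpus-level BLEU, and the names `candidate_steps`, `line_search`, and `mert` are invented for illustration. Along a fixed direction each hypothesis score is a straight line in the step size, so the 1-best choice can change only where two lines cross. Testing one step size per interval between crossings is therefore enough.

```python
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def candidate_steps(nbest_lists, weights, direction):
    """One step size per interval on which every sentence's 1-best is constant."""
    crossings = set()
    for nbest in nbest_lists:
        # score(gamma) = a + gamma * b for each hypothesis in the list
        lines = [(dot(weights, h), dot(direction, h)) for h, _ in nbest]
        for (a1, b1), (a2, b2) in combinations(lines, 2):
            if b1 != b2:  # parallel lines never swap rank
                crossings.add((a2 - a1) / (b1 - b2))
    pts = sorted(crossings)
    if not pts:
        return [0.0]
    steps = [pts[0] - 1.0]                                   # left of all crossings
    steps += [(lo + hi) / 2.0 for lo, hi in zip(pts, pts[1:])]
    steps.append(pts[-1] + 1.0)                              # right of all crossings
    return steps

def total_error(nbest_lists, weights):
    """Sum the error of the highest-scoring hypothesis of each sentence."""
    return sum(max(nbest, key=lambda c: dot(weights, c[0]))[1]
               for nbest in nbest_lists)

def line_search(nbest_lists, weights, direction):
    """Return (step, error) minimizing the error along one direction."""
    best_step, best_err = 0.0, total_error(nbest_lists, weights)
    for g in candidate_steps(nbest_lists, weights, direction):
        moved = [w + g * d for w, d in zip(weights, direction)]
        err = total_error(nbest_lists, moved)
        if err < best_err:
            best_step, best_err = g, err
    return best_step, best_err

def mert(nbest_lists, weights, rounds=5):
    """Coordinate-wise search: optimize one weight at a time until no gain."""
    weights = list(weights)
    for _ in range(rounds):
        improved = False
        for k in range(len(weights)):
            axis = [1.0 if i == k else 0.0 for i in range(len(weights))]
            before = total_error(nbest_lists, weights)
            step, err = line_search(nbest_lists, weights, axis)
            if err < before:
                weights[k] += step
                improved = True
        if not improved:
            break
    return weights

# Toy data: one sentence, two hypotheses; the second one has zero error.
toy = [[([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]]
tuned = mert(toy, [1.0, 0.0])
print(tuned, total_error(toy, tuned))
```

This sketch enumerates all pairwise crossings, which is quadratic in the n-best size. Practical implementations usually compute only the upper envelope of the lines, and they accumulate BLEU sufficient statistics across sentences rather than summing a per-sentence error.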
