Research on Decoding Algorithms for Statistical Machine Translation

Author LiangHuaCan
Tutor ZhaoTieJun
School Harbin Institute of Technology
Course Computer Science and Technology
Keywords statistical machine translation; k-best parsing; decoding algorithm; synchronous context-free grammar
CLC TP391.2
Type Master's thesis
Year 2008

As statistical machine translation (SMT) has developed, its models have progressed from word-based models through phrase-based models and formal syntax models to tree-to-string and string-to-tree models, and some researchers are now trying to build tree-to-tree models. With so many complex models, a correspondingly wide variety of decoders has been developed for them.

This thesis introduces a generalized decoding algorithm based on k-best parsing. We make small changes to the different models so that each can be expressed as a synchronous context-free grammar (SCFG), and then parse the source sentence with a monolingual k-best parsing algorithm. Because every SCFG rule has a source side and a target side, the target-language parse tree is generated synchronously as the source side is parsed. The decoder combines a variety of features in a log-linear model: the score of an SCFG rule is the weighted sum of the natural logarithms of its feature values, and these scores accumulate as the parse tree is built over the source sentence. The k-best derivations at the root vertex of the parse tree therefore give the k-best translations of the source sentence.

We also describe a popular decoding algorithm for the phrase-based model that is based on a finite automaton. Experiments with this decoder and with our generalized decoder show that the two translate equally well when given the same phrase-based model. Further experiments with the generalized decoder on several models show that the more prior knowledge is added to the model, the better the translation.
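The rule scoring and k-best combination steps summarized in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation: the function names, the feature names (`p_tm`, `p_lm`), the weights, and the `{0}`/`{1}` template convention for the target side are all assumptions made for illustration. The combination step uses a standard frontier search over two sorted child lists with a heap, which is one common way to enumerate k-best candidates and may differ from the algorithm the thesis uses.

```python
import heapq
import math


def rule_score(features, weights):
    """Log-linear rule score: weighted sum of natural logs of feature values."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


def k_best_combine(left, right, rule, k):
    """Return the k best (score, translation) pairs for one binary SCFG rule.

    left, right: k-best lists of (score, translation) for the two child
                 spans, each sorted by descending score.
    rule:        (rule_score, target_template); the template places the
                 children with {0} and {1} (hypothetical convention), so the
                 target side may reorder them.
    """
    rule_sc, template = rule
    if not left or not right:
        return []
    # heapq is a min-heap, so scores are negated. Start at the best pair (0, 0).
    heap = [(-(left[0][0] + right[0][0] + rule_sc), 0, 0)]
    seen = {(0, 0)}
    best = []
    while heap and len(best) < k:
        neg_score, i, j = heapq.heappop(heap)
        best.append((-neg_score, template.format(left[i][1], right[j][1])))
        # Expand the two neighbours of (i, j) in the grid of child pairs.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                score = left[ni][0] + right[nj][0] + rule_sc
                heapq.heappush(heap, (-score, ni, nj))
    return best


# Toy example: two child spans and a rule whose target side swaps them.
left = [(-1.0, "a"), (-3.0, "b")]
right = [(-0.5, "x"), (-1.0, "y")]
top3 = k_best_combine(left, right, (0.0, "{1} {0}"), k=3)
```

Because scores are log-domain sums, the score of a derivation is simply the sum of its rule scores, which is why the child scores and the rule score are added in `k_best_combine`. Applying the same combination bottom-up over every span of the source sentence would leave the k-best translations at the root, as the abstract describes.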
