Research of Uyghur-Chinese Machine Translation System Combination Based on Semantic Information

Published: 01 Jan 2019, Last Modified: 19 May 2025, NLPCC (2) 2019, CC BY-SA 4.0
Abstract: Uyghur-Chinese machine translation system combination suffers from two drawbacks: it does not consider semantic information during combination, and the individual systems that participate in the combination lack diversity. This paper tackles these problems by proposing a system combination method that generates multiple new systems from a single Statistical Machine Translation (SMT) engine and combines them. The new systems are generated based on a bilingual phrase semantic representation model. Specifically, a bilinear semantic similarity score and a cosine semantic similarity score are first computed for Uyghur-Chinese bilingual phrases by the bilingual phrase semantic representation model; several new systems are then generated by adding these scores, as static features and dynamic features, to the original feature set of the phrase-based translation model. Finally, the newly generated systems are combined with the baseline system to obtain the final combination results. Experimental results on the Uyghur-Chinese CWMT2013 test sets show that our approach significantly outperforms the baseline by 0.63 BLEU points.
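The two similarity scores mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the standard definitions: cosine similarity between a source and a target phrase vector, and a bilinear score of the form u^T W v. The function names, the matrix W, the example vectors, and the baseline feature values are all hypothetical and not taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two phrase vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def bilinear_similarity(u, W, v):
    """Bilinear score u^T W v, where W is an (assumed) learned interaction matrix."""
    Wv = [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]
    return sum(u_i * x for u_i, x in zip(u, Wv))

def augment_features(base_features, u, W, v):
    """Append both semantic scores to an existing phrase-pair feature list."""
    return base_features + [bilinear_similarity(u, W, v), cosine_similarity(u, v)]

# Hypothetical 2-dimensional embeddings for one Uyghur-Chinese phrase pair.
src_vec = [1.0, 0.0]
tgt_vec = [1.0, 1.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix, chosen only for illustration

# Hypothetical baseline phrase-table features, extended with the two scores.
features = augment_features([0.5, 0.2], src_vec, W, tgt_vec)
```

In a phrase-based system, scores like these would typically be stored as extra columns of the phrase table and weighted during tuning; how the paper distinguishes its static and dynamic features is not specified in the abstract.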