Extending Statistical Machine Translation with Discriminative and Trigger-Based Lexicon Models

Arne Mauser, Sasa Hasan, Hermann Ney

2009 (modified: 10 Nov 2022), EMNLP 2009

Abstract: We propose two extensions of the standard word lexicons used in statistical machine translation (SMT): a discriminative word lexicon that uses sentence-level source information to predict target words, and a trigger-based lexicon model that extends IBM Model 1 with a second trigger, allowing a more fine-grained lexical choice of target words. Both models capture dependencies beyond the scope of conventional SMT models such as phrase and language models. We show that the models improve translation quality by 1% in BLEU over a competitive baseline on a large-scale task.
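
To make the idea of a second trigger concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a Model 1 style lexicon averages p(e | f) over the source words, and that the two-trigger variant averages p(e | f, f') over pairs of source words. The function names, the probability tables, and the toy German-English entries are all hypothetical and chosen only for illustration; the NULL word and any training procedure are omitted.

```python
from itertools import combinations

def model1_word_prob(e, source, t_single):
    """Model 1 style score: average of p(e | f) over all source words f."""
    return sum(t_single.get((e, f), 0.0) for f in source) / len(source)

def triplet_word_prob(e, source, t_pair):
    """Two-trigger score: average of p(e | f, f2) over source word pairs.

    Pairs keep sentence order (f occurs before f2), so table keys are
    expected in that order. This is an assumption of the sketch.
    """
    pairs = list(combinations(source, 2))
    if not pairs:  # a one-word source sentence has no trigger pairs
        return 0.0
    return sum(t_pair.get((e, f, f2), 0.0) for f, f2 in pairs) / len(pairs)

# Hypothetical toy tables: "bank" alone is ambiguous, but together with
# "geld" (money) the financial reading becomes much more likely.
source = ["die", "bank", "geld"]
t_single = {("bank", "bank"): 0.5, ("bench", "bank"): 0.5}
t_pair = {("bank", "bank", "geld"): 0.9, ("bench", "bank", "geld"): 0.1}

single_bank = model1_word_prob("bank", source, t_single)
single_bench = model1_word_prob("bench", source, t_single)
pair_bank = triplet_word_prob("bank", source, t_pair)
pair_bench = triplet_word_prob("bench", source, t_pair)
```

With these made-up numbers the single-trigger scores for "bank" and "bench" are identical, while the pair-based scores prefer "bank". This illustrates the kind of sentence-level dependency the abstract describes, under the assumptions stated above.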
