Abstract: In this work, we propose two extensions of standard word lexicons in statistical machine translation (SMT): a discriminative word lexicon that uses sentence-level source information to predict the target words, and a trigger-based lexicon model that extends IBM Model 1 with a second trigger, allowing for a more fine-grained lexical choice of target words. The models capture dependencies that go beyond the scope of conventional SMT models such as phrase and language models. We show that the models improve translation quality by 1% in BLEU over a competitive baseline on a large-scale task.
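To illustrate the idea of a second trigger, the following is a minimal sketch and not the implementation evaluated here. It assumes that an IBM Model 1 style lexicon averages p(e | f) over the source words plus an empty word, and that the two-trigger variant averages p(e | f, f') over all unordered pairs of source positions. The function names, the `<null>` token, and the probability tables are hypothetical and chosen for illustration only.

```python
from itertools import combinations

NULL = "<null>"  # hypothetical empty-word token, an assumption of this sketch


def model1_prob(e, source, t):
    """Single-trigger score: average of t[(e, f)] over the source words
    plus the empty word. Missing table entries count as 0.0."""
    words = [NULL] + list(source)
    return sum(t.get((e, f), 0.0) for f in words) / len(words)


def triplet_prob(e, source, t2):
    """Two-trigger score: average of t2[(e, f, f2)] over all unordered
    pairs of positions in the source sentence (empty word included)."""
    words = [NULL] + list(source)
    pairs = list(combinations(words, 2))
    if not pairs:  # empty source sentence: no pair of triggers exists
        return 0.0
    return sum(t2.get((e, f, f2), 0.0) for f, f2 in pairs) / len(pairs)


# Toy tables with made-up probabilities, for illustration only.
source = ["bank", "river"]
t = {("shore", "bank"): 0.3, ("shore", "river"): 0.6}
t2 = {
    ("shore", NULL, "bank"): 0.1,
    ("shore", NULL, "river"): 0.5,
    ("shore", "bank", "river"): 0.9,
}

p_single = model1_prob("shore", source, t)
p_pair = triplet_prob("shore", source, t2)
```

In this toy example the pair ("bank", "river") receives a high probability for "shore", which is the kind of context-dependent lexical choice a single source word cannot express.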