Supervised Phrase Table Triangulation with Neural Word Embeddings for Low-Resource Languages

2015 (modified: 16 Jul 2019), EMNLP 2015, Readers: Everyone
Abstract: In this paper, we develop a supervised learning technique that improves the noisy phrase translation scores obtained by phrase table triangulation. In particular, we extract word translation distributions from small amounts of source-target bilingual data (a dictionary or a parallel corpus) and use them to learn to assign better scores to the translation candidates obtained by triangulation. Our method improves translation quality on two tasks: (1) on Malagasy-to-French translation via English, we use only 1k dictionary entries to gain +0.5 BLEU over triangulation; (2) on Spanish-to-French translation via English, we use only 4k sentence pairs to gain +0.7 BLEU over triangulation interpolated with a phrase table extracted from the same 4k sentence pairs.