Zero-shot Dual Machine Translation

Lierni Sestorain; Massimiliano Ciaramita; Christian Buck; Thomas Hofmann

27 Sept 2018 (modified: 22 Jun 2025) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Neural Machine Translation (NMT) systems rely on large amounts of parallel data. This is a major challenge for low-resource languages. Building on recent work on unsupervised and semi-supervised methods, we present an approach that combines zero-shot and dual learning. The latter relies on reinforcement learning to exploit the duality of the machine translation task, and requires only monolingual data for the target language pair. Experiments on the UN corpus show that a zero-shot dual system, trained on English-French and English-Spanish, outperforms a standard NMT system by large margins in zero-shot translation performance on Spanish-French (both directions). We also evaluate on newstest2014. These experiments show that the zero-shot dual method outperforms the LSTM-based unsupervised NMT system proposed by Lample et al. (2018b) on the en→fr task, while on the fr→en task it outperforms both the LSTM-based and the Transformer-based unsupervised NMT systems.

Keywords: unsupervised, machine translation, dual learning, zero-shot

TL;DR: A multilingual NMT model with reinforcement learning (dual learning) aiming to improve zero-shot translation directions.

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/zero-shot-dual-machine-translation/code)
