- Abstract: Multilingual Neural Machine Translation (NMT) systems are capable of translating between multiple source and target languages within a single system. An important indicator of generalization within these systems is the quality of zero-shot translation - translating between language pairs that the system has never seen during training. However, until now, the zero-shot performance of multilingual models has lagged far behind the quality that can be achieved by using a two step translation process that pivots through an intermediate language (usually English). In this work, we diagnose why multilingual models under-perform in zero shot settings. We propose explicit language invariance losses that guide an NMT encoder towards learning language agnostic representations. Our proposed strategies significantly improve zero-shot translation performance on WMT English-French-German and on the IWSLT 2017 shared task, and for the first time, match the performance of pivoting approaches while maintaining performance on supervised directions.
- Keywords: Machine Translation, Multi-lingual processing, Zero-Shot translation
- TL;DR: Simple similarity constraints on top of multilingual NMT enables high quality translation between unseen language pairs for the first time.