Abstract: Neural machine translation (NMT) systems have reached state-of-the-art performance in translating text and are widely deployed. Yet little is understood about how these systems function or break. Here we show that NMT systems are susceptible to producing highly pathological translations that are completely untethered from the source material, which we term hallucinations. Such pathological translations are problematic because they deeply undermine user trust and are easy to find. We describe a method to generate hallucinations and show that many common variations of the NMT architecture are susceptible to them. We study a variety of approaches to reduce the frequency of hallucinations, including data augmentation, dynamical systems, and regularization techniques, and show that data augmentation significantly reduces hallucination frequency. Finally, we analyze networks that produce hallucinations and show signatures of hallucinations in the attention matrix and in the stability measures of the decoder.
Keywords: translation, dynamics, chaos, adversarial, nmt, rnn
TL;DR: We introduce and analyze the phenomenon of "hallucinations" in NMT, that is, spurious translations unrelated to the source text, and propose methods to reduce their frequency.