Enabling Unsupervised Neural Machine Translation with Word-level Visual Representations

Chengpeng Fu; Xiaocheng Feng; Yichong Huang; Wenshuai Huo; Hui Wang; Bing Qin; Ting Liu

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Findings

Submission Type: Regular Long Paper

Submission Track: Machine Translation

Submission Track 2: Speech and Multimodality

Keywords: Unsupervised Machine Translation, Cross-modal Machine Translation, Word-level Image

TL;DR: Improving unsupervised machine translation with word-level visual representation to address lexical confusion

Abstract: Unsupervised neural machine translation has recently made remarkable strides, achieving impressive results using only monolingual corpora. Nonetheless, these methods still exhibit fundamental flaws, such as confusing similar words. A straightforward remedy is to employ bilingual dictionaries; however, high-quality bilingual dictionaries can be costly to obtain. To overcome this limitation, we propose a method that incorporates images at the word level to augment the lexical mappings. Specifically, our method inserts visual representations into the model, modifying the corresponding embedding layer information. In addition, a visible matrix is adopted to isolate the impact of images on other, unrelated words. Experiments on the Multi30k dataset with over 300,000 self-collected images validate the method's effectiveness in generating more accurate word translations, achieving an improvement of up to +2.81 BLEU, which is comparable or even superior to using bilingual dictionaries.
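
The visible matrix mentioned in the abstract can be pictured as an attention mask. The sketch below is only an illustration under assumptions that are not stated in the abstract: image slots are appended after the sentence tokens, each image is tied to exactly one word, and an image may exchange attention only with that word and with itself. The function name `build_visible_matrix` and its parameters are hypothetical and are not taken from the paper.

```python
def build_visible_matrix(num_words, words_with_images):
    """Return a boolean matrix where entry [i][j] says whether position i may attend to j.

    Assumed layout (not from the paper): positions 0..num_words-1 are text tokens,
    followed by one image slot per entry in words_with_images.
    """
    total = num_words + len(words_with_images)
    visible = [[False] * total for _ in range(total)]

    # Text tokens attend to every other text token, as in ordinary self-attention.
    for i in range(num_words):
        for j in range(num_words):
            visible[i][j] = True

    # Each image slot is visible only to itself and to the word it illustrates.
    for slot, word_idx in enumerate(words_with_images):
        if not 0 <= word_idx < num_words:
            raise ValueError(f"word index {word_idx} is outside the sentence")
        image_pos = num_words + slot
        visible[image_pos][image_pos] = True
        visible[image_pos][word_idx] = True
        visible[word_idx][image_pos] = True

    return visible


# Hypothetical example: a three-word sentence where only word 1 has an image.
matrix = build_visible_matrix(3, [1])
```

In an attention layer, positions marked `False` would typically receive a large negative score before the softmax, so the image cannot influence words it is unrelated to.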

Submission Number: 1773
