Globetrotter: Unsupervised Multilingual Translation from Visual Alignment

28 Sept 2020 (modified: 22 Oct 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: cross-modal, multilingual, unsupervised translation, visual similarity
Abstract: Machine translation in a multi-language scenario requires large-scale parallel corpora for every language pair. Unsupervised translation is challenging because there is no explicit connection between languages, so existing methods must rely on topological properties of the language representations. We introduce a framework that leverages visual similarity to align multiple languages, using images as the bridge between them. We estimate the cross-modal alignment between language and images, and use this estimate to guide the learning of cross-lingual representations. Our language representations are trained jointly in one model with a single stage. Experiments with fifty-two languages show that our method outperforms prior work on unsupervised word-level and sentence-level translation using retrieval.
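The pivoting idea in the abstract, using images as a shared bridge between languages, can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the embeddings, dimensions, and noise model are assumptions, standing in for learned encoders that place captions near their associated images in a shared space.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)

# Hypothetical embeddings in a shared space: each image acts as the visual
# "bridge" for one caption in each language; no parallel text is involved.
images = rng.normal(size=(4, 8))
sent_en = images + 0.1 * rng.normal(size=(4, 8))  # English captions near their image
sent_fr = images + 0.1 * rng.normal(size=(4, 8))  # French captions near the same image

# Cross-modal alignment enables cross-lingual retrieval: an English sentence
# retrieves the French sentence whose embedding (via its image) is closest.
en_to_img = np.argmax(cosine_sim(sent_en, images), axis=1)
en_to_fr = np.argmax(cosine_sim(sent_en, sent_fr), axis=1)

print(en_to_img)  # each sentence matches its own image
print(en_to_fr)   # and, through the shared space, its translation
```

In the actual system the shared space is learned, e.g. by training the sentence and image encoders so that aligned caption-image pairs score higher than mismatched ones; here the alignment is built in by construction to show how the visual pivot supports retrieval-based translation.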
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We propose a method that leverages cross-modal alignment between language and vision to train a multilingual translation system without any parallel corpora.
Community Implementations: 1 code implementation (https://www.catalyzex.com/paper/arxiv:2012.04631/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=J8BUMatCxQ
