Contrastive Learning for Low Resource Machine Translation

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: Representation learning plays a vital role in natural language processing tasks. Recent work has studied the geometry of the representation space in each layer of pre-trained language models and found that the contextualized representations of words are not isotropic in any layer. But how contextual are the representations produced by Transformer-based machine translation models? In this paper, we find that the contextualized representations of the same word in different contexts have a higher cosine similarity than those of two different words, yet this self-similarity is still relatively low, which suggests that machine translation models produce strongly context-specific representations. We present a contrastive framework for machine translation that trains the model with a supervised contrastive learning objective. By making use of data augmentation, our supervised contrastive method addresses the problem of representation learning for low-resource machine translation. Experimental results on the IWSLT14 and WMT14 datasets show that our method significantly outperforms competitive baselines.
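To make the self-similarity analysis described above concrete, the following is a minimal sketch of how such a probe could be computed, assuming the contextualized token representations have already been extracted as vectors; the function names `self_similarity` and `inter_similarity` are illustrative and not taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def self_similarity(reps):
    """Average pairwise cosine similarity of one word's contextualized
    representations collected from different sentences (contexts).
    A value near 1.0 means the word's representation barely changes with
    context; a low value indicates highly context-specific representations."""
    n = len(reps)
    sims = [cosine(reps[i], reps[j]) for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(sims))

def inter_similarity(reps_a, reps_b):
    """Average cosine similarity between representations of two *different*
    words, the baseline that self-similarity is compared against."""
    sims = [cosine(u, v) for u in reps_a for v in reps_b]
    return float(np.mean(sims))
```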
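The abstract does not specify the exact contrastive objective or augmentation strategy, so the sketch below assumes an InfoNCE-style supervised contrastive loss over sentence-level encoder representations, where the positive for each source sentence is an augmented view of the same sentence and the other sentences in the batch act as negatives; the temperature value and batching scheme are likewise assumptions.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(anchor, positive, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of sentence representations.

    anchor:   (B, d) encoder representations of the source sentences.
    positive: (B, d) representations of augmented views of the same sentences
              (e.g. via token dropout or back-translation); row i of
              `positive` is the positive for row i of `anchor`, and all
              other rows in the batch serve as negatives.
    """
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature          # (B, B) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)   # matched pairs lie on the diagonal
```

In training, a term like this would typically be added to the standard translation cross-entropy loss with a weighting coefficient; that coefficient and the choice of augmentation are left open here, since the abstract does not state them.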
