Contrastive Conditional Masked Language Model for Non-autoregressive Neural Machine Translation

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: Inspired by the success of contrastive learning in natural language processing, we incorporate contrastive learning into the conditional masked language model widely used in non-autoregressive neural machine translation (NAT), yielding what we term the Contrastive Conditional Masked Language Model (CCMLM). CCMLM optimizes the similarity between several different representations of the same token in the same sentence, resulting in richer and more robust representations. We propose two methods for obtaining these representations: Contrastive Common Mask and Contrastive Dropout. Positive pairs are different representations of the same token, while negative pairs are representations of different tokens. In the feature space, the contrastive loss pulls positive pairs together and pushes negative pairs apart. We conduct extensive experiments on four translation directions with different data sizes. The results demonstrate that CCMLM yields consistent and significant improvements of 0.80-1.04 BLEU and achieves state-of-the-art performance on WMT'16 Ro-En (34.18 BLEU).
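The abstract describes the contrastive objective only at a high level: representations of the same token (obtained, e.g., via a shared mask pattern or different dropout masks) form positive pairs, while representations of other tokens serve as negatives. Below is a minimal sketch of such a token-level contrastive loss, assuming an InfoNCE-style formulation; the function name, temperature value, and weighting factor lambda_ctr are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def token_contrastive_loss(h1: torch.Tensor, h2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Token-level InfoNCE-style contrastive loss (illustrative sketch).

    h1, h2: [num_tokens, dim] -- two representations of the same tokens,
    e.g. from two decoder passes with different dropout masks
    (a stand-in for Contrastive Dropout / Contrastive Common Mask).
    Positive pairs: (h1[i], h2[i]); negatives: all other tokens in the batch.
    """
    z1 = F.normalize(h1, dim=-1)
    z2 = F.normalize(h2, dim=-1)
    logits = z1 @ z2.t() / temperature                      # [N, N] cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)    # positives lie on the diagonal
    return F.cross_entropy(logits, targets)                 # pull positives together, push negatives apart

# Hypothetical usage: combine with the usual CMLM cross-entropy on masked tokens
# total_loss = cmlm_loss + lambda_ctr * token_contrastive_loss(h1, h2)
```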