Multilingual Alignment of Contextual Word Representations

Steven Cao; Nikita Kitaev; Dan Klein

Multilingual Alignment of Contextual Word Representations

Steven Cao, Nikita Kitaev, Dan Klein

Published: 20 Dec 2019, Last Modified: 22 Jun 2025ICLR 2020 Conference Blind SubmissionReaders: Everyone

Keywords: multilingual, natural language processing, embedding alignment, BERT, word embeddings, transfer

TL;DR: We propose procedures for evaluating and strengthening contextual embedding alignment and show that they both improve multilingual BERT's zero-shot XNLI transfer and provide useful insights into the model.

Abstract: We propose procedures for evaluating and strengthening contextual embedding alignment and show that they are useful in analyzing and improving multilingual BERT. In particular, after our proposed alignment procedure, BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model, remarkably matching pseudo-fully-supervised translate-train models for Bulgarian and Greek. Further, to measure the degree of alignment, we introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer. Using this word retrieval task, we also analyze BERT and find that it exhibits systematic deficiencies, e.g. worse alignment for open-class parts-of-speech and word pairs written in different scripts, that are corrected by the alignment procedure. These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.

Data: [MultiNLI](https://paperswithcode.com/dataset/multinli), [XNLI](https://paperswithcode.com/dataset/xnli)

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/multilingual-alignment-of-contextual-word/code)

Original Pdf: pdf

12 Replies

Loading