Macedon: Minimizing Representation Coding Rate Reduction for Cross-Lingual Natural Language Understanding

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Findings
Submission Type: Regular Long Paper
Submission Track: Machine Learning for NLP
Submission Track 2: NLP Applications
Keywords: cross-lingual, rate reduction
Abstract: Cross-lingual natural language understanding (NLU) is one of the fundamental tasks of NLP. The goal is to learn a model that generalizes well on both high-resource and low-resource language data. Recent pre-trained multilingual language models, e.g., multilingual BERT and XLM, have shown impressive performance on cross-lingual NLU tasks. However, such promising results require sufficient training data, a condition that is difficult to satisfy for low-resource languages; when data in those languages is limited, the accuracy of existing models drops. In light of this challenge, we investigate the important problem of how to train a cross-lingual model with abundant high-resource language data and limited low-resource language data. Existing methods typically learn language-agnostic representations via adversarial training and mutual information estimation, but they may suffer when data is very limited (e.g., for low-resource languages) because it is challenging to estimate the data distribution accurately. To tackle this issue, we propose a conceptually innovative approach that removes language-associated information via \textbf{m}inimizing represent\textbf{a}tion \textbf{c}oding rate r\textbf{ed}ucti\textbf{on} (Macedon). Specifically, Macedon avoids using extra codes to encode language-related information, as measured by the rate-distortion function. To validate the effectiveness of Macedon, we conduct extensive experiments on three tasks: paraphrase identification, natural language inference, and query-advertisement matching. The experimental results show that the proposed Macedon outperforms state-of-the-art cross-lingual NLU approaches.
Submission Number: 2091
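The abstract above quantifies language-related information in the learned representations with a rate-distortion (coding rate) measure. The snippet below is a minimal, hypothetical PyTorch sketch of a coding-rate-reduction penalty in the style of the rate-reduction (MCR²) literature; the function names, the per-language grouping, and the loss weight are illustrative assumptions, not the paper's actual objective or released code.

```python
# Hypothetical sketch of a coding-rate-reduction penalty for language-agnostic
# representations. Formulas follow the standard rate-reduction (MCR^2-style)
# definition; names and the 0.1 weight are illustrative, not from the paper.
import torch


def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z^T Z) for row-wise features Z (n x d)."""
    n, d = Z.shape
    I = torch.eye(d, device=Z.device, dtype=Z.dtype)
    scale = d / (n * eps ** 2)
    return 0.5 * torch.logdet(I + scale * Z.T @ Z)


def rate_reduction(Z: torch.Tensor, lang_ids: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Delta R = R(Z) - sum_j (n_j / n) * R(Z_j), grouping examples by language.

    Minimizing this term discourages the encoder from spending extra coding
    rate on language identity, i.e., it pushes representations of different
    languages toward a shared subspace.
    """
    n = Z.shape[0]
    total = coding_rate(Z, eps)
    per_lang = torch.zeros((), device=Z.device, dtype=Z.dtype)
    for lang in lang_ids.unique():
        Zj = Z[lang_ids == lang]
        per_lang = per_lang + (Zj.shape[0] / n) * coding_rate(Zj, eps)
    return total - per_lang


# Usage: add the (weighted) rate-reduction term to the task loss.
if __name__ == "__main__":
    torch.manual_seed(0)
    Z = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)  # pooled sentence embeddings
    lang_ids = torch.randint(0, 3, (32,))                            # e.g., three language ids
    task_loss = torch.tensor(0.0)                                    # placeholder classification loss
    loss = task_loss + 0.1 * rate_reduction(Z, lang_ids)
    print(float(loss))
```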