Abstract: Cross-lingual alignment in multilingual language models has been an active field of research in recent years. We survey the literature on techniques both for training well-aligned models and for improving the cross-lingual alignment of pre-trained encoders. Compiling evaluation results and method summaries, we give an overview of which methods work better than others. We further discuss how to understand cross-lingual alignment and its limitations. Finally, we discuss how these insights may be applied not only to encoder models, where this topic has been heavily studied, but also to encoder-decoder and even decoder-only models. In generative models, the focus must be on an effective trade-off between language-neutral and language-specific information.
Paper Type: long
Research Area: Multilinguality and Language Diversity
Contribution Types: Surveys
Languages Studied: multilingual models
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.