Are Structural Concepts Universal in Transformer Language Models? Towards Interpretable Cross-Lingual Generalization

Published: 07 Oct 2023, Last Modified: 01 Dec 2023EMNLP 2023 FindingsEveryoneRevisionsBibTeX
Submission Type: Regular Long Paper
Submission Track: Interpretability, Interactivity, and Analysis of Models for NLP
Submission Track 2: Multilinguality and Linguistic Diversity
Keywords: cross-lingual generalization, syntax, multilinguality, language models
TL;DR: This paper supports a prototype view of structural concepts within LLMs and points out the potential to leverage the conceptual correspondence between languages for cross-lingual generalization, especially to low-resource languages.
Abstract: Large language models (LLMs) have exhibited considerable cross-lingual generalization abilities, whereby they implicitly transfer knowledge across languages. However, the transfer is not equally successful for all languages, especially for low-resource ones, which poses an ongoing challenge. It is unclear whether we have reached the limits of implicit cross-lingual generalization and if explicit knowledge transfer is viable. In this paper, we investigate the potential for explicitly aligning conceptual correspondence between languages to enhance cross-lingual generalization. Using the syntactic aspect of language as a testbed, our analyses of 43 languages reveal a high degree of alignability among the spaces of structural concepts within each language for both encoder-only and decoder-only LLMs. We then propose a meta-learning-based method to learn to align conceptual spaces of different languages, which facilitates zero-shot and few-shot generalization in concept classification and also offers insights into the cross-lingual in-context learning phenomenon. Experiments on syntactic analysis tasks show that our approach achieves competitive results with state-of-the-art methods and narrows the performance gap between languages, particularly benefiting those with limited resources.
Submission Number: 3674
Loading