Abstract: In this study, we investigate whether non-English-centric large language models `think' in their specialized language. Specifically, we analyze how intermediate-layer representations, when projected into the vocabulary space, favor certain languages during generation; we term these latent languages. We categorize non-English-centric models into two groups: CPMs, which are English-centric models with continued pre-training on their specialized language, and BLMs, which are pre-trained from scratch on a balanced mix of multiple languages. Our findings reveal that while English-centric models rely exclusively on English as their latent language, non-English-centric models activate multiple latent languages, dynamically selecting the most similar one based on both the source and target languages. This also influences responses to questions involving cultural differences, reducing English-centric biases in non-English-centric models. This study deepens our understanding of language representation in non-English-centric LLMs, shedding light on the intricate dynamics of multilingual processing at the representational level.
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: Interpretability and Analysis of Models for NLP, Multilingualism and Cross-Lingual NLP, Ethics, Bias, and Fairness
Contribution Types: Model analysis & interpretability
Languages Studied: English, Japanese, Chinese, French, Arabic
Submission Number: 3798