The Impact of Language Mixing on Bilingual LLM Reasoning

ACL ARR 2025 May Submission 7479 Authors

20 May 2025 (modified: 03 Jul 2025) · CC BY 4.0
Abstract: Multilingual speakers often switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) exhibit language mixing—alternating languages within their chain of thought. Discouraging language mixing in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing may benefit reasoning performance. In this work, we study language switching in Chinese-English bilingual reasoning models. We identify reinforcement learning with outcome-based rewards as the critical training stage that leads to language mixing. We demonstrate that language mixing can enhance reasoning: enforcing monolingual decoding reduces accuracy by 2\% on math reasoning tasks. We further show that a lightweight probe can predict whether a potential language switch would benefit or harm reasoning, and use this prediction to guide decoding, increasing accuracy by up to 4.10\%. Our findings suggest that language mixing is not merely a byproduct of multilingual training, but a strategic reasoning behavior.
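The probe-guided decoding idea described in the abstract can be illustrated with a minimal sketch: a linear (logistic-regression) probe trained on hidden states to predict whether a language switch at the next token would help, then consulted at each candidate switch point during decoding. All names, dimensions, and data below are hypothetical stand-ins — the paper's actual probe operates on hidden states from a bilingual reasoning model, not the synthetic features used here.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_probe(H, y, lr=0.1, steps=500):
    """Train a logistic-regression probe.

    H: (n, d) hidden states at candidate switch points (synthetic here).
    y: (n,) labels, 1 = switching language helped reasoning, 0 = it hurt.
    """
    w = np.zeros(H.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(H @ w + b)))  # predicted benefit probability
        g = p - y                               # gradient of the log-loss
        w -= lr * H.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def switch_is_beneficial(h, w, b, threshold=0.5):
    """Consult the probe for one hidden state before allowing a switch."""
    return 1.0 / (1.0 + np.exp(-(h @ w + b))) >= threshold

# Synthetic training data: benefit correlates with one hidden dimension.
H = rng.normal(size=(200, 8))
y = (H[:, 0] > 0).astype(float)
w, b = train_probe(H, y)

# During decoding, gate each candidate language switch with the probe:
# if the probe predicts harm, constrain the sampler to the current language.
h_new = rng.normal(size=8)
allow = switch_is_beneficial(h_new, w, b)
```

Because the probe is a single linear layer over existing hidden states, gating adds negligible overhead per decoding step, which is consistent with the "lightweight" framing in the abstract.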
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: code-switching, mixed language, multilingualism, computational psycholinguistics, chain-of-thought, probing
Contribution Types: Model analysis & interpretability
Languages Studied: English, Chinese
Submission Number: 7479