Curse of bilinguality: Evaluating monolingual and bilingual language models on Chinese linguistic benchmarks

ACL ARR 2025 February Submission3290 Authors

15 Feb 2025 (modified: 09 May 2025)
Abstract: We investigate cross-lingual transfer in large language models (LLMs) trained on two high-resource languages, English and Chinese. Four monolingual Chinese models and four bilingual English-Chinese models are evaluated on two Chinese linguistic benchmarks. The monolingual models consistently outperform the bilingual ones on 12 out of 55 tasks, a result indicating negative transfer from English to Chinese. Additionally, we carry out a feature attribution analysis on one monolingual and one bilingual model, showing that the difference in their performance may be explained by more predictable attribution patterns in the monolingual model. Our findings have implications for the ongoing effort of training bilingual LLMs.
Paper Type: Short
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: cross-lingual transfer, multilingual pre-training
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English, Chinese
Submission Number: 3290