(Dis)improved?! How Simplified Language Affects Large Language Model Performance across Languages

ACL ARR 2025 February Submission 1314 Authors

13 Feb 2025 (modified: 09 May 2025) · CC BY 4.0
Abstract: Simplified language enhances the accessibility and human understanding of texts. However, whether it also benefits large language models (LLMs) remains underexplored. This paper studies extensively whether LLM performance improves on simplified data compared to the original texts. Our experiments span six datasets and eight automatic simplification systems across three languages. We show that English models, including GPT-4o-mini, generalize weakly and exhibit a significant performance drop on simplified data. This introduces an intriguing paradox: simplified data helps humans but not LLMs. At the same time, performance in non-English languages sometimes improves, depending on the task and the quality of the simplifier. Our findings offer a comprehensive view of the impact of simplified language on LLM performance and reveal serious implications for people who depend on simple language.
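The abstract describes a paired evaluation: the same task instances are scored once on their original text and once on an automatically simplified version, and the two accuracies are compared. Below is a minimal sketch of that comparison, assuming a generic label-prediction interface; the predictor, data, and field layout are hypothetical stand-ins, not the authors' actual evaluation harness.

```python
# Sketch of a paired original-vs-simplified evaluation (illustrative only).
from typing import Callable

# Hypothetical paired instances: (original text, simplified text, gold label).
PAIRED_DATA = [
    ("The defendant was acquitted of all charges.",
     "The court said the person did nothing wrong.", "entailment"),
]

def accuracy(predict: Callable[[str], str], texts: list[str], golds: list[str]) -> float:
    """Fraction of instances the predictor labels correctly."""
    hits = sum(predict(t) == g for t, g in zip(texts, golds))
    return hits / len(texts)

def compare(predict: Callable[[str], str]) -> None:
    """Score the same instances on original and simplified inputs."""
    originals = [o for o, _, _ in PAIRED_DATA]
    simplified = [s for _, s, _ in PAIRED_DATA]
    golds = [g for _, _, g in PAIRED_DATA]
    acc_orig = accuracy(predict, originals, golds)
    acc_simp = accuracy(predict, simplified, golds)
    # A negative delta on English data would mirror the drop reported above.
    print(f"original: {acc_orig:.2f}  simplified: {acc_simp:.2f}  "
          f"delta: {acc_simp - acc_orig:+.2f}")

if __name__ == "__main__":
    # Stub predictor; in the study this would be an LLM call (e.g., GPT-4o-mini).
    compare(lambda text: "entailment")
```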
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: LLM generalization, LLM limitations, human factors in NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English, German, Russian
Submission Number: 1314