(Dis)improved?! How Simplified Language Affects Large Language Model Performance across Languages

ACL ARR 2024 December Submission 429 Authors

13 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Simplified language enhances the accessibility and human comprehension of texts. However, whether it also benefits large language models (LLMs) remains underexplored. This paper extensively studies whether LLM performance improves on simplified data compared to its original counterpart. Our experiments span six datasets and eight automatic simplification systems across three languages. We show that English models, including GPT-4o-mini, exhibit a significant performance drop on simplified data. This reveals an intriguing paradox: simplified data helps humans but not LLMs. At the same time, performance in non-English languages sometimes improves, depending on the task and the quality of the simplifier. Our findings offer a comprehensive view of the potential and limitations of simplified language for LLM performance and uncover serious implications for people who depend on simple language.
Paper Type: Long
Research Area: Human-Centered NLP
Research Area Keywords: value-centered design, human factors in NLP, values and culture
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English, German, Russian
Submission Number: 429