Losing our Tail, Again: (Un)Natural Selection & Multilingual LLMs

ACL ARR 2026 January Submission7948 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: LLMs; Model Collapse; Position Paper; Multilinguality; Diversity; Language Change; Machine Translation
Abstract: Multilingual Large Language Models (LLMs) have considerably changed how technologies can influence language. While previous technologies could mediate or assist human writing, there is now a tendency to offload the task of writing itself to these technologies, enabling these models to change our languages more directly. While they provide us with quick access to information and impressively fluent output, beneath their (apparent) sophistication lies a subtle, more insidious threat: the gradual decline and loss of linguistic diversity. In this position paper, I explore how model collapse, with a particular focus on translation technology, can lead to the loss of linguistic forms, grammatical features, and cultural nuance. Model collapse refers to the eventual consequence of self-consuming training loops, in which automatically generated data enters the training data, thereby reinforcing biases and eroding linguistic diversity. Drawing on recent work in Computer Vision, Natural Language Processing and Machine Translation, I argue that the tails of our linguistic distributions are vanishing, and with them, the narratives and identities they carry. This paper is a call to resist linguistic flattening and to reimagine Natural Language Processing as a field that encourages, values and protects expressive multilingual diversity and creativity.
Paper Type: Long
Research Area: Multilinguality and Language Diversity
Research Area Keywords: Model Collapse; Diversity; Multilingualism; Machine Translation; Statistical Tails
Contribution Types: Position papers
Languages Studied: N/A since it is a Position Paper (but most of the related work focuses on English)
Submission Number: 7948