Abstract: Model collapse—a phenomenon where models degrade in performance due to the indiscriminate use of synthetic data—is well studied. However, its role in bias amplification—the progressive reinforcement of pre-existing social biases in Large Language Models (LLMs)—remains underexplored. In this paper, we formally define the conditions for bias amplification and demonstrate through statistical simulations that bias can intensify even in the absence of sampling errors, the primary driver of model collapse. Empirically, we investigate political bias amplification in GPT-2 using a custom-built benchmark for sentence-continuation tasks. Our findings reveal a progressively increasing right-leaning bias. Furthermore, we evaluate three mitigation strategies—Overfitting, Preservation, and Accumulation—and show that bias amplification persists even when model collapse is mitigated. Finally, a mechanistic interpretability analysis identifies distinct sets of neurons responsible for model collapse and bias amplification, suggesting that the two arise from different underlying mechanisms.
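To make the simulation claim concrete, the following is a minimal sketch (not the paper's actual setup; the `sharpen` function, the `gamma` value, and the initial probability are all hypothetical stand-ins for functional approximation error). Each generation refits the exact output distribution of the previous model, so there is no sampling noise at all, yet a slight sharpening miscalibration still drives the dominant class probability toward 1.

```python
# Illustrative sketch: deterministic bias amplification with no sampling error.
# A "model" refits the exact class distribution each generation, but with a
# slight sharpening miscalibration (gamma > 1), standing in for functional
# approximation error. Nothing is sampled, so sampling noise cannot be the cause
# of the drift.

def sharpen(p: float, gamma: float) -> float:
    """Miscalibrated fit: exaggerates whichever class is already dominant."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma)

p = 0.55      # initial probability of the slightly dominant (biased) class
gamma = 1.1   # hypothetical sharpening factor; gamma = 1 would be a perfect fit
for gen in range(20):
    # Each generation trains on the previous generation's exact output distribution.
    p = sharpen(p, gamma)
    print(f"generation {gen + 1:2d}: P(biased continuation) = {p:.4f}")
```

With `gamma = 1` the fit is exact and `p` stays fixed; any `gamma > 1` amplifies whichever bias is already in the majority, mirroring the claim that bias amplification does not require sampling errors.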
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: bias/toxicity, human-in-the-loop, transparency, model bias/fairness evaluation, model bias/unfairness mitigation, ethical considerations in NLP applications, generalization, probing, data augmentation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English
Submission Number: 242