Bias Dynamics in BabyLMs: Towards a Compact Sandbox for Democratising Pre-Training Debiasing

Published: 14 Dec 2025, Last Modified: 14 Dec 2025 · LM4UC@AAAI 2026 · CC BY 4.0
Keywords: Debiasing, Democratising, Language Model, Bias
TL;DR: BabyLMs reproduce the bias and debiasing dynamics of larger LMs, serving as an effective proxy that reduces the compute required for pre-model debiasing experiments from over 500 to ~30 GPU-hours, substantially democratising debiasing research.
Abstract: Pre-trained language models (LMs) have, over the last few years, grown substantially in both societal adoption and training cost. This rapid growth in size has constrained progress in understanding and mitigating their biases, especially towards under-represented communities. Since re-training LMs is prohibitively expensive, most debiasing work has focused on post-hoc or masking-based strategies, which often fail to address the underlying causes of bias. In this work, we seek to democratise pre-model debiasing research by using low-cost proxy models, striving to make this research direction accessible to projects outside large industry labs. Specifically, we investigate BabyLMs, compact BERT-like models trained on small and mutable corpora that can simulate the bias acquisition and learning dynamics of larger models. We show that, despite their drastically reduced size, BabyLMs display patterns of intrinsic bias formation and performance development closely aligned with those of standard BERT models. Furthermore, these correlations between BabyLMs and BERT hold across multiple intra-model and post-model debiasing methods. Leveraging these similarities, we conduct pre-model debiasing experiments with BabyLMs, replicating prior findings and presenting new insights into the influence of gender imbalance and toxicity on bias formation. Our results demonstrate that BabyLMs can serve as an effective sandbox for large-scale LMs, reducing pre-training costs from over 500 GPU-hours to just over 30 GPU-hours. This provides a way to democratise pre-model debiasing research by enabling faster, more accessible exploration of novel debiasing strategies and the examination of historically under-explored bias topics in service of building fairer LMs.
Submission Number: 10