Selection Collider Bias in Large Language Models

Published: 09 Jul 2022 · Last Modified: 20 Apr 2025 · CRL@UAI 2022 Poster
Keywords: large language models, causal inference, selection bias
TL;DR: Large language models (LLMs) can learn statistical dependencies between otherwise unconditionally independent variables.
Abstract: In this paper, we motivate the causal mechanisms behind sample selection collider bias in Large Language Models (LLMs). We show that selection collider bias can be amplified in underspecified learning tasks, and that the magnitude of the resulting spurious correlations appears to be scale-agnostic. While selection collider bias can be pervasive and difficult to overcome, we describe a method that exploits the resulting spurious associations to measure when a model may be uncertain about its prediction, and demonstrate it on an extended version of the Winogender Schemas evaluation set.
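
As a minimal illustration of the causal mechanism the abstract describes (a toy simulation, not taken from the paper): two unconditionally independent variables become spuriously correlated once we condition on a collider, here modeled as a sample-selection step. The variable names and threshold are illustrative.

```python
import numpy as np

# Toy collider-bias demo: X and Y are independent causes of a
# selection variable S. Conditioning on S (i.e., only keeping
# selected samples) induces a spurious correlation between X and Y.
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = rng.normal(size=n)      # independent of x by construction
selected = (x + y > 1.0)    # selection depends on both causes

print(f"corr(X, Y), all samples:      {np.corrcoef(x, y)[0, 1]:+.3f}")
print(f"corr(X, Y), selected samples: {np.corrcoef(x[selected], y[selected])[0, 1]:+.3f}")
```

On the full sample the correlation is near zero; on the selected subsample it is strongly negative ("explaining away"), which is the dependency structure the TL;DR refers to.

The abstract's uncertainty-measurement idea can likewise be sketched on a Winogender-style template. The following is a minimal sketch, not the paper's released implementation: the model choice (`bert-base-uncased`), the masked-pronoun scoring, and the probability-gap heuristic are all assumptions made here for illustration.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-uncased"  # placeholder model choice
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

# Winogender-style sentence with the coreferent pronoun masked out.
template = (
    f"The nurse notified the patient that {tok.mask_token} shift "
    "would be ending in an hour."
)
inputs = tok(template, return_tensors="pt")
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]
probs = logits.softmax(-1)

p_her = probs[tok.convert_tokens_to_ids("her")].item()
p_his = probs[tok.convert_tokens_to_ids("his")].item()

# When the context underspecifies the referent's gender, comparable
# mass on both pronouns suggests the model is (appropriately)
# uncertain; a large gap suggests the prediction is driven by a
# spurious occupational association rather than by the context.
print(f"P(her)={p_her:.4f}  P(his)={p_his:.4f}  gap={abs(p_her - p_his):.4f}")
```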
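The gap between the pronoun probabilities serves here as a rough proxy for the kind of uncertainty signal the abstract describes: under an underspecified task, any systematic preference must come from the learned spurious association rather than from evidence in the prompt.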
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/selection-collider-bias-in-large-language/code)