TL;DR: We describe actionable directions for systematizing LLM bias studies by connecting probes and the constructs they are intended to measure.
Abstract: The proliferation of LLM bias probes introduces three challenges: we lack (1) principled criteria for selecting appropriate probes, (2) a system for reconciling conflicting results across probes, and (3) formal frameworks for reasoning about when and why experimental findings will generalize to real user behavior. In response, we propose a systematic approach to LLM social bias probing, drawing on insights from the social sciences. Central to this approach is EcoLevels, a novel framework that helps (a) identify appropriate bias probes, (b) reconcile conflicting results, and (c) generate predictions about bias generalization. We ground our framework in the social sciences, as many LLM probes are adapted from human studies, and these fields have faced similar challenges when studying bias in humans. Finally, we outline five lessons that demonstrate how LLM bias probing can (and should) benefit from decades of social science research.
Lay Summary: Given that millions of people use Large Language Models (LLMs) each day, researchers have developed tools to understand whether the behaviors of these AI models reflect social biases. However, the present paper argues that the sheer number of tools ("bias probes") has created three problems: (1) there is no clear guidance for picking which kind of probe to use when testing a model for bias, (2) different probes often give conflicting results, and (3) it is hard to know how these results apply to real-world use. To fix this, the authors propose a new framework called “EcoLevels,” which borrows ideas from social science research. This approach helps researchers choose better tools for testing bias, make sense of mixed results, and predict when a model’s bias might actually affect people. The goal is to bring more structure and theory to the way we study bias in AI systems.
Primary Area: Research Priorities, Methodology, and Evaluation
Keywords: bias probing, LLMs, EcoLevels, interdisciplinary, social bias, psychological theory
Submission Number: 397