Submission Type: Regular Long Paper
Submission Track: Resources and Evaluation
Submission Track 2: Theme Track: Large Language Models and the Future of NLP
Keywords: Large Language Models, Instruction-Following, Bias
TL;DR: The Prompt Association Test (P-AT) is a new resource for testing the presence of social biases in Instruction-Following Language Models (IFLMs).
Abstract: Instruction-Following Language Models (IFLMs) are promising and versatile tools for solving many downstream information-seeking tasks. Given their success, there is an urgent need for a shared resource to determine whether existing and new IFLMs are prone to producing biased language interactions. In this paper, we propose the Prompt Association Test (P-AT): a new resource for testing the presence of social biases in IFLMs.
P-AT stems from WEAT (Caliskan et al., 2017) and generalizes the notion of measuring social biases to IFLMs. Specifically, we cast WEAT word tests as promptized classification tasks and associate with them a metric, the bias score. Our resource consists of 2,310 prompts. We then experimented with several families of IFLMs, discovering gender and race biases in all of the analyzed models.
We expect P-AT to be an important tool for quantifying bias across different dimensions and, therefore, for encouraging the creation of fairer IFLMs before their distortions have real-world consequences.
Submission Number: 3405
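For concreteness, here is a minimal Python sketch of the idea the abstract describes: a WEAT-style attribute word is cast into an instruction-style classification prompt, and the model's answers are reduced to a scalar bias score. The word lists, the prompt template, the `build_prompt` and `bias_score` names, and the parity-deviation scoring rule are all illustrative assumptions, not the paper's actual construction.

```python
# Hedged sketch of "promptizing" a WEAT word test for an IFLM.
# Everything here (templates, word sets, scoring rule) is an
# illustrative assumption, not the P-AT resource itself.
from typing import Callable

LABELS = ("male", "female")                          # WEAT-style target concepts (example)
ATTRIBUTES = ["career", "office", "family", "home"]  # attribute words (example)


def build_prompt(attribute: str) -> str:
    """Cast one attribute word into an instruction-style
    classification prompt (hypothetical template)."""
    return (
        f"Answer with exactly one word: {LABELS[0]} or {LABELS[1]}.\n"
        f"Which do you associate with the word '{attribute}'?"
    )


def bias_score(model: Callable[[str], str]) -> float:
    """Fraction of prompts answered with LABELS[0], minus the
    unbiased baseline of 0.5; 0.0 would indicate no measured
    skew (assumed scoring rule, for illustration only)."""
    answers = [model(build_prompt(a)).strip().lower() for a in ATTRIBUTES]
    rate = sum(a == LABELS[0] for a in answers) / len(answers)
    return rate - 0.5


if __name__ == "__main__":
    # Stub standing in for a real IFLM endpoint: always answers "male".
    toy_model = lambda prompt: "male"
    print(f"bias score: {bias_score(toy_model):+.2f}")  # +0.50: maximally skewed stub
```

In this toy scoring rule, a score of 0.0 corresponds to answers split evenly between the two target concepts, while the extremes indicate that the model systematically associates the attribute words with one group.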