Leveraging In-Context Learning for Political Bias Testing of LLMs

ACL ARR 2024 June Submission 1417 Authors

14 Jun 2024 (modified: 02 Jul 2024) · License: CC BY 4.0
Abstract: A growing body of work has been querying LLMs with political questions to evaluate their potential biases. However, this probing method has limited stability, making comparisons between models unreliable. In this paper, we argue that LLMs need more context. We propose a new probing task, Questionnaire Modeling, that uses human survey data as in-context examples. We show that Questionnaire Modeling improves the stability of question-based bias evaluation, and demonstrate that it may be used to compare instruction-tuned models to their base versions. Experiments with two open-source LLMs indicate that instruction tuning can indeed change the direction of bias. Data and code are publicly available.
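The abstract describes Questionnaire Modeling only at a high level. As a purely illustrative sketch (the exact prompt template, survey items, and answer format used in the paper are not reproduced here; `SurveyItem` and `build_prompt` are hypothetical names), the idea of supplying human survey answers as in-context examples before a probe question might look roughly like this in Python:

```python
# Minimal sketch of a Questionnaire-Modeling-style prompt.
# Assumed format: human survey answers are shown as in-context examples
# before the question used to probe the model.

from dataclasses import dataclass

@dataclass
class SurveyItem:
    question: str   # political statement shown to respondents
    answer: str     # a recorded human answer, e.g. "Agree" / "Disagree"

def build_prompt(context_items: list[SurveyItem], probe_question: str) -> str:
    """Format in-context survey examples followed by the probe question."""
    lines = ["Below are survey questions and how a respondent answered them.", ""]
    for item in context_items:
        lines.append(f"Question: {item.question}")
        lines.append(f"Answer: {item.answer}")
        lines.append("")
    lines.append(f"Question: {probe_question}")
    lines.append("Answer:")
    return "\n".join(lines)

if __name__ == "__main__":
    context = [
        SurveyItem("The government should reduce income inequality.", "Agree"),
        SurveyItem("Military spending should be increased.", "Disagree"),
    ]
    print(build_prompt(context, "Environmental regulations should be stricter."))
```

The model's completion for the final "Answer:" slot would then be scored against the question's answer options; the paper's released data and code define the actual evaluation protocol.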
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: probing, robustness
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 1417