Step-by-Step Evaluation of Gender Bias in Large Language Models

ACL ARR 2024 June Submission4813 Authors

16 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large language models (LLMs) tend to internalize and reproduce discriminatory societal biases. The natural-language reasoning process elicited by Chain-of-Thought (CoT) prompting helps determine whether an LLM's answers rest on a correct understanding of the task. However, it has not been clarified whether the information provided by CoT leads to an accurate evaluation of the LLM's social biases. In this paper, we introduce a benchmark that evaluates gender-related social biases based on the step-by-step process elicited by CoT prompts. We construct the benchmark around an English reasoning task in which the LLM is given a list of words comprising feminine, masculine, and gendered occupational words, and is required to count the number of feminine and masculine words. Our CoT prompts require the LLM to explicitly indicate whether each word in the list is feminine or masculine. Experimental results show that considering both the step-by-step process and the predictions of LLMs improves the quality of bias evaluation. Furthermore, despite the simplicity of the word-counting task, our benchmark yields evaluations of gender-related social biases that are comparable to those of existing manually crafted benchmarks.
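As a rough illustration of the counting task described in the abstract, the sketch below builds one benchmark-style instance: a shuffled list mixing feminine, masculine, and gendered occupational words, plus a CoT prompt that asks the model to label each word before reporting totals. The word lists, sampling sizes, and prompt wording are assumptions for illustration, not the authors' actual benchmark data or prompts.

```python
import random

# Hypothetical seed vocabularies; the paper's actual word lists are not
# reproduced here, so these entries are illustrative only.
FEMININE = ["she", "mother", "queen", "aunt"]
MASCULINE = ["he", "father", "king", "uncle"]
OCCUPATIONS = ["nurse", "engineer", "librarian", "carpenter"]  # gendered occupational words

def build_instance(n_fem=2, n_masc=2, n_occ=2, seed=0):
    """Sample a shuffled word list and a CoT prompt that asks the model to
    label each word as feminine or masculine before counting."""
    rng = random.Random(seed)
    words = (rng.sample(FEMININE, n_fem)
             + rng.sample(MASCULINE, n_masc)
             + rng.sample(OCCUPATIONS, n_occ))
    rng.shuffle(words)
    prompt = (
        "Count how many words in the list are feminine and how many are masculine.\n"
        f"Word list: {', '.join(words)}\n"
        "Let's think step by step. For each word, state whether it is feminine "
        "or masculine, then report the two totals."
    )
    return words, prompt

if __name__ == "__main__":
    _, prompt = build_instance()
    print(prompt)
```

Under this setup, the model's per-word labels for the occupational words (e.g., whether it calls "nurse" feminine) expose the step-by-step reasoning that the benchmark inspects alongside the final counts.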
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Large Language Models, Gender bias evaluation, step-by-step
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 4813