Natural Language Reasoning Process Enhances Binary Gender Bias Evaluation

ACL ARR 2024 December Submission 1842 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract:

Large language models (LLMs) tend to internalize and reproduce discriminatory societal biases. The natural language reasoning process elicited by Chain-of-Thought (CoT) prompting helps determine whether an LLM's predictions rest on a correct grasp of the task. However, it remains unclear whether the information provided by CoT leads to a more accurate evaluation of an LLM's gender biases. This paper investigates how incorporating the step-by-step reasoning elicited by CoT prompts affects gender bias evaluation results. Since having humans author step-by-step reasoning for evaluation is costly, we automatically construct a template-based benchmark for social bias evaluation. Specifically, the benchmark targets an English reasoning task in which the LLM is given a list containing demographic-attribute words (e.g., gender and race) and occupational words and must count the number of demographic-attribute words. Our CoT prompts require the LLM to explicitly indicate whether each word in the list is related to a demographic attribute. Experimental results show that considering both the step-by-step reasoning process and the LLM's predictions improves the quality of bias evaluation. Furthermore, the same tendencies are observed on evaluation datasets covering eight social biases, such as race and religion.
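
To make the task concrete, the sketch below shows how one counting instance of the kind described in the abstract might be generated from templates. The word pools, function name, and prompt wording are illustrative assumptions, not the paper's actual templates or vocabularies.

```python
import random

# Hypothetical word pools -- illustrative only; the paper's actual
# templates and vocabularies are not reproduced here.
GENDER_WORDS = ["she", "he", "woman", "man", "mother", "father"]
OCCUPATION_WORDS = ["nurse", "engineer", "teacher", "plumber", "doctor"]

def make_counting_example(n_attr=3, n_occ=4, seed=0):
    """Build one word-list counting instance: the model must count
    how many words in the list denote a demographic (gender) attribute."""
    rng = random.Random(seed)
    words = rng.sample(GENDER_WORDS, n_attr) + rng.sample(OCCUPATION_WORDS, n_occ)
    rng.shuffle(words)
    gold = n_attr  # ground-truth count of demographic-attribute words

    # CoT-style prompt: ask for a per-word judgment before the final count.
    prompt = (
        "Count how many words in the list refer to a gender attribute.\n"
        f"List: {', '.join(words)}\n"
        "Let's think step by step: for each word, state whether it is a "
        "gender-attribute word, then give the final count."
    )
    return prompt, gold

if __name__ == "__main__":
    prompt, gold = make_counting_example()
    print(prompt)
    print("Expected answer:", gold)
```

Comparing the model's per-word judgments and final count against the gold count is what allows the evaluation to consider both the reasoning process and the prediction, as the abstract describes.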

Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Large Language Models, Gender bias evaluation, step-by-step
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 1842