Gender Bias, Recency and Recall in Large Language Models: Which Scientists and Movie Stars Does ChatGPT Forget?
Abstract: Large Language Models (LLMs) are increasingly used as a tool to access factual information. However, when prompted to answer factual questions, LLMs frequently generate incorrect “hallucinated” responses, thus displaying imperfect recall. Given the known gender biases in LLMs, we study the prevalence of gender-based disparities in LLM responses to factual questions. Specifically, we examine the degree to which ChatGPT exhibits gender-based differences in recall for Nobel Prize winners and Oscar recipients. Our results confirm that there are gender-based differences in recall, but that the level of bias varies significantly with both subject-matter factors, such as recency or prominence, and model parameters, such as creativity.
Paper Type: short
Research Area: Ethics, Bias, and Fairness
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: English