Using Chain-of-Thought Prompting for Interpretable Recognition of Social Bias

Published: 23 Oct 2023, Last Modified: 28 Nov 2023, SoLaR Poster
Keywords: Large Language Models, Stereotype Detection, Chain of Thought Prompting
TL;DR: This paper shows that inducing increasingly deep reasoning in large language models through zero-shot chain-of-thought prompts drastically improves their ability to accurately identify harmful stereotypes.
Abstract: Because language models are trained on vast datasets that may contain inherent biases, they risk inadvertently perpetuating systemic discrimination. It is therefore essential to examine and address biases in language models, integrating fairness into their development so that these models are equitable. In this work, we demonstrate the importance of reasoning for zero-shot stereotype identification, using Vicuna-13B and -33B and LLaMA-2-chat-13B and -70B. Although scaling from 13B to larger models improves accuracy, we show that the performance gain from reasoning significantly exceeds the gain from scaling up. Our findings suggest that reasoning is a key factor that enables LLMs to transcend the scaling law on out-of-domain tasks such as stereotype identification. Additionally, through a qualitative analysis of selected reasoning traces, we highlight how reasoning improves not just accuracy but also the interpretability of the model's decisions.
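To illustrate the kind of zero-shot chain-of-thought prompt the abstract describes, the sketch below constructs one for stereotype identification. This is a hypothetical reconstruction for illustration only: the exact prompt wording, label set, and reasoning trigger used in the paper are assumptions, not the authors' actual templates.

```python
def build_cot_prompt(statement: str, depth: int = 1) -> str:
    """Build a zero-shot chain-of-thought prompt asking an LLM to classify
    a statement as a harmful stereotype or not.

    `depth` loosely mimics the paper's idea of inducing increasingly deep
    reasoning by strengthening the reasoning instruction (the tiers below
    are illustrative, not the paper's actual prompt variants).
    """
    # Reasoning triggers of increasing "depth" (assumed, for illustration).
    triggers = [
        "Answer with 'stereotype' or 'not a stereotype'.",
        "Let's think step by step, then answer "
        "'stereotype' or 'not a stereotype'.",
        "First identify the group referenced, then the generalization made "
        "about it, then whether that generalization is harmful. "
        "Finally answer 'stereotype' or 'not a stereotype'.",
    ]
    tier = min(max(depth, 0), len(triggers) - 1)
    return (
        "Consider the following statement:\n"
        f'"{statement}"\n\n'
        f"{triggers[tier]}"
    )


# Example: a deeper prompt includes the step-by-step reasoning instruction.
prompt = build_cot_prompt("Group X is bad at math.", depth=1)
print(prompt)
```

The returned string would then be sent to a model such as Vicuna or LLaMA-2-chat; the generated reasoning trace preceding the final label is what makes the decision inspectable.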
Submission Number: 69