DR.GAP: Mitigating Bias in Large Language Models using Gender-Aware Prompting with Demonstration and Reasoning
Abstract: Large Language Models (LLMs) exhibit strong natural language processing capabilities but also inherit and amplify societal biases, including gender bias, raising fairness concerns. Existing debiasing methods face significant limitations: parameter tuning requires access to model weights, prompt-based approaches often degrade model utility, and optimization-based techniques lack generalizability. To address these challenges, we propose *DR.GAP* (*D*emonstration and *R*easoning for *G*ender-*A*ware *P*rompting), an automated, model-agnostic approach that mitigates gender bias while preserving model performance. *DR.GAP* selects bias-revealing examples and generates structured reasoning to guide models toward more impartial responses. Extensive experiments on coreference resolution and question-answering (QA) tasks across multiple LLMs (`GPT-3.5`, `Llama3`, and `Llama2-Alpaca`) demonstrate its effectiveness, generalization ability, and robustness. Moreover, *DR.GAP* generalizes to vision-language models (VLMs), achieving significant bias reduction.
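As a rough illustration of the prompting scheme the abstract describes, the sketch below composes a bias-revealing demonstration and a structured reasoning step into a single query prompt. The function name `build_drgap_prompt`, the WinoBias-style demonstration, and the reasoning text are hypothetical stand-ins for exposition, not the paper's released prompts or selection procedure.

```python
# Minimal sketch of DR.GAP-style prompt construction. The prompt layout,
# demonstration, and reasoning below are illustrative assumptions, not the
# paper's actual artifacts.

def build_drgap_prompt(query: str, demonstration: str, reasoning: str) -> str:
    """Prepend a bias-revealing demonstration and structured reasoning
    to the task query, nudging the model toward an impartial answer."""
    return (
        "Demonstration:\n" + demonstration + "\n\n"
        "Reasoning:\n" + reasoning + "\n\n"
        "Question:\n" + query + "\nAnswer:"
    )

# Hypothetical coreference-resolution demonstration in the WinoBias style.
demo = (
    "Sentence: The developer argued with the designer because she did not "
    "like the design.\n"
    "Question: Who does 'she' refer to?"
)
reasoning = (
    "Gender provides no evidence here: either role could be referred to by "
    "'she'. Resolve the pronoun from syntax and semantics alone, without "
    "assuming a gender for 'developer' or 'designer'."
)

prompt = build_drgap_prompt(
    query=(
        "Sentence: The nurse notified the patient that his shift would end "
        "soon.\nQuestion: Who does 'his' refer to?"
    ),
    demonstration=demo,
    reasoning=reasoning,
)
print(prompt)  # This string would be sent to the target LLM as-is.
```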
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, model bias/unfairness mitigation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 4928