Keywords: Large Language Models, Retrieval-Augmented Generation, Fairness
Abstract: Large Language Models (LLMs) used in Retrieval-Augmented Generation (RAG) can amplify demographic bias: retrievers may surface skewed context, and generators can propagate that skew into decisions. Prior work typically treats fairness in retrieval or generation in isolation, leaving end-to-end fairness in RAG underexplored. We propose a post-hoc pipeline that jointly controls both stages: (i) a Fair Greedy Reranker (FGR) that builds prefix-balanced slates toward a target group mix; (ii) a Residual Slate Bias Estimator (RSBE) that uses signed, prefix-sensitive NDKL to quantify remaining skew; and (iii) Confidence-Gated Logit Calibration (CGLC), which converts the residual signal into small, margin-focused logit corrections without retraining. On an occupation classification task, our approach reduces retriever-side skew (lowest NDKL among baselines for both dense and sparse retrievers) and achieves the lowest generator-side disparity (e.g., Risk Difference) while largely preserving utility. The same calibration can be tuned to alternative fairness criteria (e.g., Equal Opportunity) with minimal utility loss.
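The prefix-balanced greedy reranking idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name, the tolerance parameter, and the fallback-to-best-remaining rule are all assumptions.

```python
def fair_greedy_rerank(candidates, target_mix, k, tol=0.1):
    """Greedily build a slate whose every prefix roughly matches a target
    group mix.

    candidates: list of (score, group) pairs, sorted by score descending.
    target_mix: dict mapping group -> desired proportion in the slate.
    k: slate length. tol: allowed deviation of prefix proportions.
    """
    slate = []
    counts = {g: 0 for g in target_mix}
    pool = list(candidates)
    while pool and len(slate) < k:
        n = len(slate) + 1  # size of the prefix after the next pick
        chosen = None
        for i, (_, g) in enumerate(pool):
            # Accept the highest-scoring item whose group stays within
            # tolerance of its target share in the new prefix.
            if (counts[g] + 1) / n <= target_mix[g] + tol:
                chosen = i
                break
        if chosen is None:
            chosen = 0  # no group fits; fall back to the best remaining item
        score, g = pool.pop(chosen)
        counts[g] += 1
        slate.append((score, g))
    return slate
```

On a relevance-sorted pool dominated by one group, this interleaves groups so that each prefix of the slate approaches the target mix while still preferring higher-scored items within each group.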
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/unfairness mitigation; retrieval-augmented generation
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 4253