LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models

ACL ARR 2025 May Submission 4907 Authors

20 May 2025 (modified: 03 Jul 2025) · CC BY 4.0
Abstract: Large Language Models (LLMs) have attracted widespread attention due to their powerful performance. However, because they are unavoidably exposed to socially biased data during training, LLMs tend to exhibit social biases, particularly gender bias. To better explore and quantify the degree of gender bias in LLMs, we propose a pair of datasets, GenBiasEval and GenHintEval. GenBiasEval evaluates the degree of gender bias in LLMs and is accompanied by an evaluation metric named AFGB-Score ($\textbf{A}$bsolutely $\textbf{F}$air $\textbf{G}$ender $\textbf{B}$ias $\textbf{Score}$). GenHintEval assesses whether LLMs can provide responses consistent with prompts that contain gender hints, along with the accompanying evaluation metric UB-Score ($\textbf{U}$n$\textbf{B}$ias $\textbf{Score}$). Furthermore, to mitigate gender bias in LLMs more effectively, we present the LFTF ($\textbf{L}$ocating $\textbf{F}$irst and $\textbf{T}$hen $\textbf{F}$ine-Tuning) algorithm. The algorithm first ranks specific LLM blocks by their relevance to gender bias in descending order using a metric called BMI ($\textbf{B}$lock $\textbf{M}$itigating $\textbf{I}$mportance Score). Based on this ranking, the block most strongly associated with gender bias is then fine-tuned with a carefully designed loss function. Extensive experiments show that the LFTF algorithm significantly mitigates gender bias in LLMs while maintaining their general capabilities.
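Below is a minimal sketch of the locate-then-fine-tune idea described in the abstract, assuming a Hugging Face decoder-only model whose transformer blocks are exposed as a module list. The block_relevance_score function is a hypothetical stand-in for the paper's BMI metric, and the paper's debiasing loss is not reproduced; the sketch only illustrates how gradient updates can be restricted to the single highest-ranked block so that the rest of the model, and hence its general capabilities, remains untouched.

```python
# Sketch of "locate first, then fine-tune" (LFTF); placeholder scoring, not the paper's BMI.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumed example model; any decoder-only LLM with a block list works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

blocks = model.transformer.h  # GPT-2 layout; LLaMA-style models expose model.model.layers

def block_relevance_score(block: torch.nn.Module) -> float:
    """Hypothetical stand-in for BMI: the real metric measures how strongly a
    block relates to gender bias (e.g., on probe prompts), which is not shown here."""
    return sum(p.abs().mean().item() for p in block.parameters())

# 1) Locate: rank blocks by their (placeholder) relevance score, descending.
scores = {i: block_relevance_score(b) for i, b in enumerate(blocks)}
target_idx = max(scores, key=scores.get)

# 2) Fine-tune: freeze all parameters except those of the top-ranked block.
for p in model.parameters():
    p.requires_grad = False
for p in blocks[target_idx].parameters():
    p.requires_grad = True

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)
# A training loop would then minimize the paper's debiasing loss,
# updating only blocks[target_idx] through this optimizer.
```

Restricting the optimizer to one block is what lets the approach trade off debiasing against preservation of general capabilities, as claimed in the abstract.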
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Large language models, gender bias, model editing
Languages Studied: English
Submission Number: 4907