Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation

ACL ARR 2025 February Submission 2473 Authors

14 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Mitigating biases, such as language models' reliance on gender stereotypes, is crucial for creating reliable and useful language technology. A key requirement of debiasing is that models preserve their versatile capabilities, including the ability to solve language tasks and to represent various genders equitably. To address these issues, we introduce the *Dual Debiasing Algorithm through Model Adaptation* (*2DAMA*). The novel *Dual Debiasing* scheme enables robust reduction of stereotypical bias while preserving the desired factual gender information encoded by language models. We show that *2DAMA* effectively reduces gender bias in language models for English and is one of the first approaches facilitating the mitigation of their stereotypical tendencies in translation. The proposed method's key advantage is the preservation of factual gender cues, which are useful in a wide range of natural language processing tasks.
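The abstract's core idea, removing a stereotypical-bias signal while keeping the factual gender signal, can be sketched as a linear-algebra operation. The Python sketch below is purely illustrative and rests on assumptions not stated on this page: it treats stereotype and factual-gender cues as linear subspaces of a model's representation space, and the helper `dual_debias_projection`, its arguments, and the toy usage are hypothetical names, not the authors' 2DAMA procedure.

```python
import numpy as np

def dual_debias_projection(stereo_dirs, fact_dirs):
    """Build a projection that nulls stereotype directions while
    leaving factual-gender directions intact (illustrative sketch,
    not the paper's 2DAMA implementation).

    stereo_dirs: (k, d) rows spanning an assumed stereotype subspace
    fact_dirs:   (m, d) rows spanning an assumed factual-gender subspace
    """
    d = stereo_dirs.shape[1]
    # Orthonormal basis of the factual subspace (to be preserved).
    F, _ = np.linalg.qr(fact_dirs.T)              # (d, m)
    # Subtract the factual component from each stereotype direction,
    # so only the non-factual part of the stereotype signal is removed.
    residual = stereo_dirs - (stereo_dirs @ F) @ F.T
    U, s, _ = np.linalg.svd(residual.T, full_matrices=False)
    U = U[:, s > 1e-8]                            # pure-stereotype basis
    # Projection that nulls the pure-stereotype subspace; since the
    # columns of U are orthogonal to F, P leaves span(F) unchanged.
    return np.eye(d) - U @ U.T

# Toy usage on a random (d, d) weight matrix.
rng = np.random.default_rng(0)
d = 16
W = rng.normal(size=(d, d))
stereo = rng.normal(size=(3, d))    # hypothetical stereotype directions
fact = rng.normal(size=(2, d))      # hypothetical factual-gender directions
P = dual_debias_projection(stereo, fact)
W_debiased = P @ W                  # stereotype signal removed from outputs
```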
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Gender Bias, Model Editing, Multilingual Debiasing
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English, Czech, German, Russian
Submission Number: 2473