GeNRe: a French Gender-Neutral Rewriting System Using Collective Nouns

ACL ARR 2024 June Submission2431 Authors

15 Jun 2024 (modified: 19 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: A significant portion of the textual data used in Natural Language Processing (NLP) exhibits gender biases, particularly due to the use of masculine generics (masculine words that are supposed to refer to mixed groups of men and women), which can perpetuate and amplify stereotypes. Gender rewriting, an NLP task that involves automatically detecting and replacing gendered forms with neutral or opposite forms (e.g., from masculine to feminine), can be employed to mitigate these biases. Such systems exist for English, Arabic, Portuguese and German, but none is available for French. This paper presents GeNRe, the very first French gender-neutral rewriting system, which relies on collective nouns, which are gender-fixed in French. We introduce a rule-based system (RBS) tailored for the French language alongside two fine-tuned large language models trained on data generated by our RBS. We also explore the use of instruction models to enhance the performance of our other systems and find that Claude 3 Opus combined with our dictionary achieves results close to our RBS. Through this contribution, we hope to promote the advancement of gender bias mitigation techniques in NLP for French.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/unfairness mitigation
Contribution Types: Publicly available software and/or pre-trained models, Data resources
Languages Studied: French
Submission Number: 2431
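
To make the abstract's idea of dictionary-driven rewriting with collective nouns concrete, here is a minimal Python sketch. It is not the authors' RBS: the `COLLECTIVES` dictionary, its three entries, and the `rewrite` function are hypothetical names and data chosen for illustration. It only swaps noun phrases and does not handle the agreement changes (for example, plural verbs becoming singular) that a real system for French would need.

```python
import re

# Hypothetical toy dictionary (an assumption, not the paper's resource):
# masculine generic plural phrase -> collective noun phrase.
COLLECTIVES = {
    "les électeurs": "l'électorat",
    "les lecteurs": "le lectorat",
    "les policiers": "la police",
}


def rewrite(sentence: str) -> str:
    """Replace each known masculine generic phrase with its collective noun.

    Only the noun phrase is swapped; verb and adjective agreement is
    deliberately left untouched in this sketch.
    """
    out = sentence
    for generic, collective in COLLECTIVES.items():
        pattern = re.compile(r"\b" + re.escape(generic) + r"\b", re.IGNORECASE)

        def keep_case(match, repl=collective):
            # Preserve a sentence-initial capital letter if the match had one.
            if match.group(0)[0].isupper():
                return repl[0].upper() + repl[1:]
            return repl

        out = pattern.sub(keep_case, out)
    return out


print(rewrite("Nous avons interrogé les électeurs."))
# -> Nous avons interrogé l'électorat.
```

The example sentence places the noun phrase in object position, where no further agreement is required; a subject-position phrase such as "Les électeurs votent" would also need its verb changed to the singular, which this sketch does not attempt.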