Abstract: Innovative transformer-based language models produce contextually aware token embeddings and have achieved state-of-the-art performance on a variety of natural language tasks, but have been shown to encode biases that are unwanted in downstream applications. In this paper, we evaluate the social biases encoded by transformers trained with the masked language modeling objective. Using proposed proxy functions within an iterative masking experiment, we measure the quality of masked language models' (MLMs') predictions and assess their preference for advantaged over disadvantaged groups. We compare our bias estimates with those produced by other evaluation methods on benchmark datasets and assess their alignment with human-annotated biases. We find relatively high religious and disability biases across the considered MLMs, and lower gender bias in one dataset relative to another. We extend previous work by evaluating the social biases introduced after re-training an MLM under the masked language modeling objective, and find that the proposed measures estimate biases introduced by re-training more accurately than measures based on models' relative preference for biased sentences.
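To make the iterative masking setup concrete, the sketch below scores a sentence by masking one token at a time and summing the log-probability the MLM assigns to the original token (a pseudo-log-likelihood), then compares paired sentences referring to different social groups. This is a minimal illustration assuming a HuggingFace-style masked LM such as `bert-base-uncased`; the paper's proposed proxy functions and example sentences are not reproduced here.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

def pseudo_log_likelihood(sentence, model, tokenizer):
    """Mask each token in turn and sum the log-probability the MLM
    assigns to the original (ground-truth) token at that position."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Skip the special tokens at the start and end ([CLS]/[SEP]).
    for pos in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits
        log_probs = torch.log_softmax(logits[0, pos], dim=-1)
        total += log_probs[input_ids[pos]].item()
    return total

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Illustrative (hypothetical) sentence pair: a higher score for the first
# sentence would indicate a preference for the stereotyping variant.
s1 = "The poor are ignorant about how to handle money."
s2 = "The rich are ignorant about how to handle money."
print(pseudo_log_likelihood(s1, model, tokenizer))
print(pseudo_log_likelihood(s2, model, tokenizer))
```

In practice, such per-sentence scores would be aggregated over a benchmark of paired sentences (e.g., advantaged vs. disadvantaged group mentions) to estimate a model-level bias measure.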
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, model bias/unfairness mitigation, ethical considerations in NLP applications, reflections and critiques
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 12