Reviewed Version (pdf): https://openreview.net/references/pdf?id=l_cASAkH-9
Keywords: Natural Language Inference, Natural Language Understanding, Natural Language Processing, Gender Bias, Societal Bias, Bias, Ethics, Debiasing Techniques, Data Augmentation
Abstract: Gender stereotypes have recently raised significant ethical concerns in natural language processing. However, progress in detecting and evaluating gender bias in natural language understanding through inference remains limited and requires further investigation. In this work, we propose an evaluation methodology to measure these biases by constructing a probe task that pairs a gender-neutral premise with a gender-specific hypothesis. We use our probe task to investigate state-of-the-art NLI models for the presence of gender stereotypes associated with occupations. Our findings suggest that three models (BERT, RoBERTa, and BART) trained on the MNLI and SNLI datasets are significantly prone to gender-induced prediction errors. We also find that debiasing techniques, such as augmenting the training data to make it gender-balanced, can help reduce such bias in certain cases.
One-sentence Summary: We propose an evaluation methodology, constructing a challenge task to demonstrate that state-of-the-art fine-tuned Transformer-based NLI models exhibit gender bias in their outputs, and explore an existing debiasing technique to mitigate this bias.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics