Mitigating Bias in Natural Language Inference Using Adversarial Learning

Yonatan Belinkov; Adam Poliak; Stuart M. Shieber; Benjamin Van Durme

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Withdrawn Submission | Readers: Everyone

Abstract: Recognizing the relationship between two texts is an important aspect of natural language understanding (NLU), and a variety of neural network models have been proposed for solving NLU tasks. Unfortunately, recent work has shown that the datasets these models are trained on often contain biases that allow models to achieve non-trivial performance, possibly without learning the relationship between the two texts. We propose a framework for building robust models by using adversarial learning to encourage models to learn latent, bias-free representations. We test our approach in a Natural Language Inference (NLI) scenario and show that our adversarially trained models learn robust representations that ignore known dataset-specific biases. Our experiments demonstrate that our models transfer more robustly to new NLI datasets.

Keywords: natural language inference, adversarial learning, bias, artifacts

TL;DR: Adversarial learning methods encourage NLI models to ignore dataset-specific biases and help models transfer across datasets.
