Beyond Accuracy: Revisiting Out-of-Distribution Generalization in NLI Models

Published: 24 May 2025, Last Modified: 24 May 2025 · CoNLL 2025 · CC BY 4.0
Keywords: OOD Generalization, NLI, Transformers, Linear Separability
Abstract: This study investigates the generalization abilities of discriminative transformers in Natural Language Inference (NLI) tasks, focusing on their tendency to rely on superficial features and dataset biases rather than genuine linguistic understanding. We argue that performance gaps between training and analysis datasets do not necessarily indicate a lack of knowledge, but rather a misalignment between the decision boundaries of the classifier head and the representations learned by the encoder. By analyzing the representation space of NLI models on these datasets, we show that, despite poor classification accuracy from the model's final predictions, samples from opposing classes often remain linearly separable in the encoder's representation space. This suggests that the encoders possess sufficient knowledge to perform the NLI task effectively, even when the classifier head fails to exploit it.
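The linear-separability argument can be made concrete with a linear probe: freeze the encoder, extract its representations on an analysis set, fit a simple linear classifier, and compare its accuracy against the model's own classifier head. The sketch below is an illustrative assumption of such a setup, not the authors' exact protocol; the model name, the toy premise/hypothesis pairs, the label indices, and the use of the last-layer [CLS] vector are all placeholder choices.

```python
# Hypothetical sketch: probe linear separability of frozen encoder
# representations on an (OOD) NLI analysis set, and compare against the
# accuracy of the model's own classifier head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

MODEL_NAME = "textattack/bert-base-uncased-MNLI"  # illustrative NLI model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def encode(premises, hypotheses, batch_size=32):
    """Return last-layer [CLS] representations and classifier-head predictions."""
    reps, preds = [], []
    with torch.no_grad():
        for i in range(0, len(premises), batch_size):
            batch = tokenizer(premises[i:i + batch_size],
                              hypotheses[i:i + batch_size],
                              padding=True, truncation=True,
                              return_tensors="pt")
            out = model(**batch, output_hidden_states=True)
            reps.append(out.hidden_states[-1][:, 0, :])   # [CLS] vector
            preds.append(out.logits.argmax(dim=-1))       # head prediction
    return torch.cat(reps).numpy(), torch.cat(preds).numpy()

# Placeholder analysis-set examples; replace with a real challenge/analysis
# dataset. Label indices must follow the chosen model's label scheme.
premises = ["A man is playing a guitar.", "The cat sleeps on the mat.",
            "Two dogs run in a park.", "A woman reads a book."]
hypotheses = ["A person plays an instrument.", "The cat is awake.",
              "Animals are running outside.", "The woman is swimming."]
labels = [0, 2, 0, 2]  # e.g. 0 = entailment, 2 = contradiction

X, head_preds = encode(premises, hypotheses)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.5,
                                          stratify=labels, random_state=0)

# If the linear probe separates the classes far better than the original head,
# the encoder representations still carry the task-relevant signal.
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("classifier-head accuracy:", (head_preds == labels).mean())
print("linear-probe accuracy:   ", probe.score(X_te, y_te))
```

In practice the probe would be trained and evaluated on a held-out split of the analysis dataset with many more examples; the comparison of interest is the gap between the probe's accuracy and the frozen classifier head's accuracy on the same representations.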
Submission Number: 206