Accurate, yet Inconsistent? Consistency Analysis on Language Models

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: Consistency, i.e., generating the same prediction for semantically similar contexts, is highly desirable in a sound language model. Although recent pre-trained language models (PLMs) deliver outstanding performance on various downstream tasks, they should also behave consistently if they truly understand language. In this paper, we propose a simple framework, Consistency Analysis on Language Models (CALM), to evaluate a lower bound on a model's consistency. Our experiments confirm that current PLMs frequently generate inconsistent predictions, with high confidence, even for semantically identical inputs. We also observe that multi-task training helps improve consistency, raising it by 17% on average.
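The abstract's working definition of consistency — agreement between a model's predictions on semantically equivalent inputs — can be sketched as a simple pairwise metric. This is a minimal illustration, not the paper's CALM framework: `toy_model` and the paraphrase pairs below are invented stand-ins for a real PLM and a real paraphrase dataset.

```python
# Sketch of pairwise consistency evaluation: feed a model semantically
# equivalent inputs and measure how often its predictions agree.
# `toy_model` is a hypothetical stand-in for a real pre-trained model.

def toy_model(text: str) -> str:
    # Toy sentiment classifier: predicts "positive" iff "great" appears.
    # Its reliance on a surface cue is exactly what makes it inconsistent.
    return "positive" if "great" in text.lower() else "negative"

def consistency_rate(model, paraphrase_pairs):
    """Fraction of paraphrase pairs on which the model's predictions agree."""
    agreements = sum(model(a) == model(b) for a, b in paraphrase_pairs)
    return agreements / len(paraphrase_pairs)

pairs = [
    ("The movie was great.", "The film was great."),        # predictions agree
    ("The movie was great.", "It was a wonderful movie."),  # disagree: cue word missing
    ("The plot was dull.", "The storyline was boring."),    # agree (both negative)
]

print(f"consistency: {consistency_rate(toy_model, pairs):.2f}")  # 2 of 3 pairs agree
```

A real evaluation in this spirit would swap in an actual PLM and a curated paraphrase set; the rate then serves as an empirical lower bound on the model's consistency, since any disagreement on genuine paraphrases is an inconsistency.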