How does Uncertainty Impact Explanation Coherence?

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: By introducing noise into test data, we show that increased uncertainty at test time does not necessarily imply explanation incoherence.
Abstract: Explainable AI methods facilitate the understanding of model behaviour. However, small, imperceptible perturbations to inputs can vastly distort explanations. As these explanations are typically evaluated holistically, before model deployment, it is difficult to assess when a particular explanation is trustworthy. In contrast, uncertainty is easy to measure at inference time and in an unsupervised fashion. Some studies have tried to build confidence estimators for explanations, but none have investigated whether a link exists between uncertainty and explanation quality. We artificially simulate epistemic uncertainty in text inputs by introducing noise at inference time. In this large-scale empirical study, we inject varying levels of noise through a range of perturbation strategies and measure the effect on the outputs and uncertainty metrics of pre-trained language models (PLMs). We find that uncertainty and explanation coherence exhibit a task-dependent correlation, which can be moderately positive and likely stems from noise encountered during training; this suggests that these models may be better at identifying salient tokens when uncertain, which can be exploited for human-AI collaboration. While this quality can be at odds with robustness to noise, Integrated Gradients typically shows good robustness while retaining a relatively strong correlation with uncertainty on perturbed data. This suggests that uncertainty is not only an indicator of output reliability but also a potential indicator of explanation coherence.
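The abstract does not include code, so the following is only a minimal sketch of the kind of pipeline it describes: perturb a text input at inference time and observe how an uncertainty measure responds. The choice of model (a Hugging Face sequence classifier as a stand-in PLM), the character-swap noise scheme, and predictive entropy as the uncertainty metric are all illustrative assumptions, not necessarily the paper's setup.

```python
# Illustrative sketch only: add noise to a text input at inference time and
# measure how predictive uncertainty (here, softmax entropy) changes.
# Model, noise scheme, and uncertainty metric are assumptions, not the paper's.
import random
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"  # stand-in PLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def add_char_noise(text: str, level: float, seed: int = 0) -> str:
    """Randomly replace a fraction `level` of characters with random letters."""
    rng = random.Random(seed)
    chars = list(text)
    n_swaps = int(level * len(chars))
    for i in rng.sample(range(len(chars)), k=n_swaps):
        chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
    return "".join(chars)

@torch.no_grad()
def predictive_entropy(text: str) -> float:
    """Entropy of the output softmax, a simple proxy for model uncertainty."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze(0)
    return float(-(probs * probs.log()).sum())

sentence = "The film was a quiet, moving triumph."
for level in (0.0, 0.1, 0.2, 0.3):
    noisy = add_char_noise(sentence, level)
    print(f"noise={level:.1f}  entropy={predictive_entropy(noisy):.3f}")
```

On the explanation side, the abstract names Integrated Gradients; in practice one could compute attributions for the clean and perturbed inputs (e.g. with an attribution library) and compare their agreement, but that step is omitted here to keep the sketch short.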
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English