Class Explanations: the Role of Domain-Specific Content and Stop Words

Published: 20 Mar 2023, Last Modified: 17 Apr 2023. Venue: NoDaLiDa 2023.
Keywords: explainable AI, global explanations, class explanations, LIME
TL;DR: We propose a new methodology based on aggregation and filtering for extracting informative class explanations for text classification.
Abstract: We address two understudied areas related to explainability for neural text models. First, class explanations: what features are descriptive across a class, rather than explaining single input instances? Second, the type of features used for providing explanations: does the explanation involve the statistical pattern of word usage or the presence of domain-specific content words? Here, we present a method to extract class explanations, together with strategies to differentiate between the two types of explanations: domain-specific signals and statistical variations in the frequencies of common words. We demonstrate our method in a case study analysing transcripts of political debates in the Swedish Riksdag.
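
The abstract describes the methodology only at a high level. As a rough illustration of the aggregation-and-filtering idea, the sketch below sums instance-level LIME token weights over all documents of one class and then splits the resulting ranking with a stop-word list. The names `documents`, `classifier_fn`, and `stop_words` are hypothetical stand-ins, and summing weights plus stop-word filtering is an assumption about the approach, not the authors' exact procedure.

```python
# Minimal sketch (assumed aggregation, not the paper's exact pipeline):
# aggregate per-token LIME weights across all documents of one class,
# then use a stop-word list to separate domain-specific content words
# from statistical variations in common-word usage.
from collections import defaultdict

from lime.lime_text import LimeTextExplainer


def class_explanation(documents, classifier_fn, label, stop_words,
                      num_features=10):
    explainer = LimeTextExplainer()
    totals = defaultdict(float)
    for doc in documents:  # documents assigned to the class of interest
        exp = explainer.explain_instance(
            doc, classifier_fn, labels=(label,), num_features=num_features
        )
        for token, weight in exp.as_list(label=label):
            totals[token] += weight
    # Rank tokens by aggregated weight, most class-indicative first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    content = [(t, w) for t, w in ranked if t.lower() not in stop_words]
    common = [(t, w) for t, w in ranked if t.lower() in stop_words]
    return content, common  # domain-specific vs. common-word signals
```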
Student Paper: Yes, the first author is a student