Class Explanations: the Role of Domain-Specific Content and Stop Words

Published: 20 Mar 2023, Last Modified: 17 Apr 2023. Venue: NoDaLiDa 2023.
Keywords: explainable AI, global explanations, class explanations, LIME
TL;DR: We propose a new methodology based on aggregation and filtering for extracting informative class explanations for text classification.
Abstract: We address two understudied areas related to explainability for neural text models. First, class explanations: what features are descriptive across a class, rather than explaining single input instances? Second, the type of features used for providing explanations: does the explanation involve the statistical pattern of word usage or the presence of domain-specific content words? Here, we present a method to extract class explanations, together with strategies to differentiate between the two types of explanations: domain-specific signals and statistical variations in the frequencies of common words. We demonstrate our method in a case study analysing transcripts of political debates in the Swedish Riksdag.
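
The abstract describes the methodology only at a high level. As a rough illustration of the aggregation-and-filtering idea, the sketch below sums instance-level LIME token weights over all documents of one class and then splits the resulting ranking with a stop-word list. The names `documents`, `classifier_fn`, and `stop_words` are hypothetical stand-ins, and summing weights plus stop-word filtering is an assumption about the approach, not the authors' exact procedure.

```python
# Minimal sketch (assumed aggregation, not the paper's exact pipeline):
# aggregate per-token LIME weights across all documents of one class,
# then use a stop-word list to separate domain-specific content words
# from statistical variations in common-word usage.
from collections import defaultdict

from lime.lime_text import LimeTextExplainer


def class_explanation(documents, classifier_fn, label, stop_words,
                      num_features=10):
    explainer = LimeTextExplainer()
    totals = defaultdict(float)
    for doc in documents:  # documents assigned to the class of interest
        exp = explainer.explain_instance(
            doc, classifier_fn, labels=(label,), num_features=num_features
        )
        for token, weight in exp.as_list(label=label):
            totals[token] += weight
    # Rank tokens by aggregated weight, most class-indicative first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    content = [(t, w) for t, w in ranked if t.lower() not in stop_words]
    common = [(t, w) for t, w in ranked if t.lower() in stop_words]
    return content, common  # domain-specific vs. common-word signals
```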
Student Paper: Yes, the first author is a student