TopEx: Topic-based Explanations for Model Comparison

01 Mar 2023 (modified: 01 Jun 2023)
Submitted to Tiny Papers @ ICLR 2023
Readers: Everyone
Keywords: Explainability, Topic Modeling, NLP, Language Models, Feature Attribution
TL;DR: To meaningfully explain and compare LMs, we propose TopEx, a topic-based explanation method that uses topic modeling to condense feature attributions into a model-independent explanation.
Abstract: Meaningfully comparing language models is challenging with current explanation methods. Current explanations are either overwhelming for humans because of large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.
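
The idea of condensing feature attributions into topics can be illustrated with a minimal sketch. This is not the authors' implementation: the word-to-topic mapping `WORD_TOPICS`, the attribution values, and the helper `topic_attribution` are all hypothetical, and the sketch assumes attributions have already been merged from model-specific subword tokens to whole words. It sums absolute attribution per topic and normalizes, so that two models can be compared over the same small set of topics rather than over their vocabularies.

```python
from collections import defaultdict

def topic_attribution(word_scores, word_topics):
    """Sum absolute word attributions per topic and normalize them to sum to 1.

    word_scores: list of (word, attribution) pairs for one model.
    word_topics: dict mapping a lowercase word to a topic label.
    Words without a topic are ignored.
    """
    totals = defaultdict(float)
    for word, score in word_scores:
        topic = word_topics.get(word.lower())
        if topic is not None:
            totals[topic] += abs(score)
    mass = sum(totals.values())
    return {t: v / mass for t, v in totals.items()} if mass else {}

# Hypothetical word-to-topic assignment (in practice this would come from a topic model).
WORD_TOPICS = {"great": "sentiment", "boring": "sentiment",
               "plot": "story", "actor": "cast"}

# Hypothetical word-level attributions from two different models on the same input.
model_a = [("great", 0.6), ("plot", 0.2), ("actor", 0.2)]
model_b = [("great", 0.3), ("plot", -0.5), ("actor", 0.2)]

expl_a = topic_attribution(model_a, WORD_TOPICS)
expl_b = topic_attribution(model_b, WORD_TOPICS)

# Compare the two models topic by topic.
for topic in sorted(set(expl_a) | set(expl_b)):
    a, b = expl_a.get(topic, 0.0), expl_b.get(topic, 0.0)
    print(f"{topic:10s} A={a:.2f} B={b:.2f} diff={a - b:+.2f}")
```

Taking absolute values treats positive and negative attributions as equally important; keeping the sign instead would be an equally reasonable choice under different assumptions.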