LMExplainer: A Knowledge-Enhanced Explainer for Language Models

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: LMExplainer improves the transparency of (large) LMs via a more transparent graph surrogate, producing clearer, human-understandable explanations and outperforming existing models on major datasets.
Abstract: Language models (LMs) such as GPT-4 are adept at tasks ranging from text generation to question answering. However, their decision process lacks transparency due to complex model structures and millions of parameters. This hinders user trust in LMs, especially in safety-critical applications. Given the opaque nature of LMs, a promising approach to explaining how they work is to generate explanations on a more transparent surrogate (e.g., a knowledge graph (KG)). Such works mostly exploit attention weights to provide explanations for LM recommendations. However, pure attention-based explanations lack the scalability to keep up with the growing complexity of LMs. To bridge this important gap, we propose LMExplainer, a knowledge-enhanced explainer for LMs capable of providing human-understandable explanations. It is designed to efficiently locate the most relevant knowledge within a large-scale KG via a graph attention network (GAT) and extract key decision signals reflecting how a given LM works. Extensive experiments comparing LMExplainer against eight state-of-the-art baselines show that it outperforms existing LM+KG methods and large LMs (LLMs) on the CommonsenseQA and OpenBookQA datasets. We compare the explanations generated by LMExplainer with other algorithm-generated explanations as well as human-annotated explanations. The results show that LMExplainer generates more comprehensive and clearer explanations.
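To illustrate the kind of mechanism the abstract describes (this is a minimal sketch, not the paper's implementation), the code below applies a single-head graph attention layer to a toy retrieved KG subgraph and ranks nodes by the attention mass they receive, as a rough proxy for "key decision signals". The node names, edge list, feature dimensions, and ranking heuristic are all illustrative assumptions.

```python
# Hypothetical sketch: score nodes of a toy knowledge subgraph with a
# single-head graph attention layer and rank them by attention mass.
# Not the paper's code; names and dimensions are illustrative assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy retrieved subgraph: concept nodes and directed (source, target) edges.
nodes = ["question_entity", "bird", "can_fly", "penguin"]
edges = [(0, 1), (1, 2), (0, 3), (3, 1)]

d_in, d_out = 8, 8
x = torch.randn(len(nodes), d_in)              # node features (e.g., LM embeddings)
W = torch.nn.Linear(d_in, d_out, bias=False)   # shared linear projection
a = torch.nn.Linear(2 * d_out, 1, bias=False)  # attention scoring vector

h = W(x)  # projected node features

# Unnormalized attention logits e_ij = LeakyReLU(a^T [h_i || h_j]) per edge.
src = torch.tensor([s for s, _ in edges])
dst = torch.tensor([t for _, t in edges])
logits = F.leaky_relu(a(torch.cat([h[src], h[dst]], dim=-1)).squeeze(-1), 0.2)

# Softmax-normalize attention over each target node's incoming edges.
alpha = torch.zeros(len(edges))
for t in dst.unique():
    mask = dst == t
    alpha[mask] = F.softmax(logits[mask], dim=0)

# Rank each node by the total attention it receives as a message sender:
# higher mass suggests a larger contribution to downstream reasoning.
importance = torch.zeros(len(nodes))
importance.index_add_(0, src, alpha)
for idx in importance.argsort(descending=True).tolist():
    print(f"{nodes[idx]:>16s}: {importance[idx].item():.3f}")
```

In the actual LM+KG setting described in the abstract, the node features would come from the LM's representations of retrieved KG entities, and the attention scores would feed both the answer prediction and the natural-language explanation; the ranking loop here is only a simplified stand-in for that pipeline.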
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English
