TL;DR: We formalize the explanations extracted by Self-Explainable GNNs, highlighting their links to existing definitions of explanations as well as their strengths and limitations. To address some of these limitations while preserving their benefits, we propose a simple dual-channel model.
Abstract: Self-Explainable Graph Neural Networks (SE-GNNs) are popular explainable-by-design GNNs, but the properties and limitations of the explanations they produce are not well understood.
Our first contribution fills this gap by formalizing the explanations extracted by some popular SE-GNNs, referred to as Minimal Explanations (MEs), and comparing them to established notions of explanations, namely Prime Implicant (PI) and faithful explanations.
Our analysis reveals that MEs match PI explanations for a restricted but significant family of tasks.
In general, however, they can be less informative than PI explanations and are surprisingly misaligned with widely accepted notions of faithfulness.
Although faithful and PI explanations are informative, they are intractable to find, and we show that they can be prohibitively large.
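For readers less familiar with formal explainability, the block below recalls the standard feature-based definition of a Prime Implicant (abductive) explanation as background only; the graph-structured formulation used in the paper may differ in its details, so this should not be read as the paper's exact definition.

```latex
% Background sketch: the standard prime-implicant (abductive) explanation
% for a feature-based classifier. The paper's graph-specific formulation
% may differ in its details.
Let $f$ be a classifier and $\mathbf{x} \in \mathcal{X}$ an instance with
prediction $f(\mathbf{x}) = c$. A subset of features
$\mathcal{E} \subseteq \{1, \dots, d\}$ is a PI explanation iff fixing the
features in $\mathcal{E}$ to their values in $\mathbf{x}$ is sufficient for
the prediction,
\[
  \forall \mathbf{z} \in \mathcal{X} :
  \Big( \bigwedge_{i \in \mathcal{E}} z_i = x_i \Big)
  \;\Rightarrow\; f(\mathbf{z}) = c ,
\]
and no proper subset $\mathcal{E}' \subsetneq \mathcal{E}$ satisfies the same
implication (subset-minimality).
```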
Given these observations, a natural choice is to augment SE-GNNs with alternative explanation modalities that address these limitations. To this end, we propose Dual-Channel GNNs, which integrate a white-box rule extractor with a standard SE-GNN and adaptively combine the two channels.
Our experiments show that even a simple instantiation of Dual-Channel GNNs can recover succinct rules and perform on par with or better than widely used SE-GNNs.
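To make the dual-channel idea concrete, here is a minimal, purely illustrative PyTorch sketch of one way a white-box rule channel could be mixed with a standard GNN channel through a learned gate. All class names (RuleChannel, GNNChannel, DualChannelGNN), the dense-adjacency message passing, and the mean-pooled gating are assumptions made for this example, not the authors' implementation; see the repository linked below for the actual code.

```python
# Illustrative sketch only: class names and architectural details are
# assumptions made for this example, not the paper's implementation.
import torch
import torch.nn as nn


class RuleChannel(nn.Module):
    """White-box channel: a linear rule over mean-pooled node features."""

    def __init__(self, num_features: int, num_classes: int):
        super().__init__()
        self.linear = nn.Linear(num_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, num_features]; a single graph, for brevity.
        pooled = x.mean(dim=0, keepdim=True)           # [1, num_features]
        return self.linear(pooled)                     # [1, num_classes]


class GNNChannel(nn.Module):
    """Standard message-passing channel (one mean-aggregation layer)."""

    def __init__(self, num_features: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.msg = nn.Linear(num_features, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj: dense [num_nodes, num_nodes] adjacency with self-loops.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.msg(adj @ x / deg))        # mean neighborhood aggregation
        return self.out(h.mean(dim=0, keepdim=True))   # graph-level logits


class DualChannelGNN(nn.Module):
    """Adaptively mixes the white-box rule channel and the GNN channel."""

    def __init__(self, num_features: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.rule = RuleChannel(num_features, num_classes)
        self.gnn = GNNChannel(num_features, hidden_dim, num_classes)
        self.gate = nn.Linear(num_features, 1)         # per-graph mixing weight

    def forward(self, x: torch.Tensor, adj: torch.Tensor):
        alpha = torch.sigmoid(self.gate(x.mean(dim=0, keepdim=True)))  # in (0, 1)
        logits = alpha * self.rule(x) + (1.0 - alpha) * self.gnn(x, adj)
        return logits, alpha   # alpha indicates which channel drove the prediction


# Toy usage: a random 5-node graph with 4 node features and 2 classes.
x = torch.randn(5, 4)
adj = (torch.rand(5, 5) > 0.5).float()
adj = torch.clamp(adj + adj.T + torch.eye(5), max=1.0)  # symmetrize, add self-loops
model = DualChannelGNN(num_features=4, hidden_dim=8, num_classes=2)
logits, alpha = model(x, adj)
print(logits.shape, float(alpha))
```

In this sketch, the per-graph gate value can be inspected at inference time to see which channel drove a given prediction, mirroring the adaptive combination of a simple rule-based channel and a structural channel described above.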
Lay Summary: Graph Neural Networks (GNNs) are a type of AI model that can analyze and make predictions about data that’s best represented as a network—like social networks or molecules. Some special GNNs, called Self-Explainable GNNs (SE-GNNs), are designed not just to make predictions, but to also explain why they made those predictions by pointing to parts of the input network that were most important. This paper looks closely at how good those explanations really are, and whether highlighting parts of the input is always the most sensible type of guidance to provide.
While SE-GNNs often give simple and compact explanations, these explanations can sometimes be incomplete or misleading. For example, they may fail to show all the reasons behind a prediction, or they might give the same explanation for predictions that were made for different reasons. We then investigate an alternative design: combining the usual explanation method with a second, simpler rule-based system. Together, the two parts help the model decide when to rely on basic patterns (like simple rules about node features) and when to focus on more complex network structures.
Our findings offer practical guidance to practitioners on the limits of SE-GNNs' explanations, and a warning to stakeholders against blindly trusting the explanations produced by these models.
Link To Code: https://github.com/steveazzolin/beyond-topo-segnns
Primary Area: Deep Learning->Graph Neural Networks
Keywords: GNNs, Interpretability, Self-Explainable models, Formal Explainability, Self-Explainable GNNs
Submission Number: 7549