Keywords: Model-Level Explanations for GNNs, Theory of XAI, Evaluation of GNN Explainability
Abstract: Model-level explanations for Graph Neural Networks (GNNs) aim to identify class-discriminative motifs that capture how a classifier recognizes a target class. Because the motifs the classifier actually relies on are unobservable, most approaches evaluate explanations by their target class score. However, class score alone is not sufficient, as high-scoring explanations may be pathological or may fail to reflect the full range of motifs recognized by the classifier. To bridge this gap, this work introduces sufficiency risk as a formal criterion for whether explanations adequately represent the classifier’s reasoning, and derives distribution-free certificates that upper-bound this risk. Building on this foundation, three metrics are introduced: Coverage, Greedy Gain Area (GGA), and Overlap, which operationalize the certificates to assess sufficiency, efficiency, and redundancy in explanations. To ensure practical utility, finite-sample concentration bounds are developed for these metrics, providing confidence intervals that enable statistically reliable comparison between explainers. Experiments on synthetic data, and with three state-of-the-art explainers on four real-world datasets, demonstrate that these metrics reveal differences in explanation quality hidden by class scores alone. Designed to complement the class score, they constitute the first theoretically certified framework for evaluating model-level explanations of GNNs.
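The abstract refers to finite-sample concentration bounds that attach confidence intervals to the proposed metrics. As a purely illustrative sketch, and not the paper's actual certificates, the snippet below shows how a distribution-free Hoeffding interval could be placed around an empirical coverage-style proportion; the function name, the notion of a graph being "covered" by an explanation motif, and the example numbers are all hypothetical.

```python
import math

def coverage_confidence_interval(hits, n, delta=0.05):
    """Two-sided Hoeffding confidence interval for an empirical proportion.

    hits : number of sampled class-positive graphs judged "covered" by at least
           one explanation motif (hypothetical criterion, not the paper's).
    n    : total number of sampled graphs.
    delta: total failure probability of the interval.
    """
    p_hat = hits / n
    # Hoeffding: |p_hat - p| <= sqrt(log(2/delta) / (2n)) with probability >= 1 - delta.
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)

# Hypothetical usage: explainer A covers 412 of 500 sampled graphs, explainer B covers 390.
lo_a, hi_a = coverage_confidence_interval(412, 500)
lo_b, hi_b = coverage_confidence_interval(390, 500)
print(f"A: [{lo_a:.3f}, {hi_a:.3f}]  B: [{lo_b:.3f}, {hi_b:.3f}]")
# Non-overlapping intervals would support a statistically reliable ranking of the two explainers.
```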
Primary Area: interpretability and explainable AI
Submission Number: 16040