InduCE: Inductive Counterfactual Explanations for Graph Neural Networks

Published: 11 May 2024, Last Modified: 11 May 2024. Accepted by TMLR.
Abstract: Graph neural networks (GNNs) drive several real-world applications including drug discovery, recommendation engines, and chip design. Unfortunately, GNNs are black boxes, since they do not allow human-intelligible explanations of their predictions. Counterfactual reasoning is an effort to overcome this limitation. Specifically, the objective is to minimally perturb the input graph to a GNN so that its prediction changes. While several algorithms have been proposed for counterfactual explanations of GNNs, the majority suffer from three key limitations: (1) they only consider perturbations in the form of deletions of existing edges, (2) they perform an inefficient exploration of the combinatorial search space, and (3) the counterfactual explanation model is transductive in nature, i.e., it does not generalize to unseen data. In this work, we propose an inductive algorithm called InduCE that overcomes these limitations. Through extensive experiments on graph datasets, we show that incorporating edge additions and modelling the marginal effect of perturbations aid in generating better counterfactuals among the available recourse. Furthermore, inductive modeling enables InduCE to directly predict counterfactual perturbations without requiring instance-specific training. This leads to significant computational speed-up over baselines and allows counterfactual analyses for GNNs at scale.
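The counterfactual objective described above (minimally perturbing the graph, via both edge additions and deletions, until the GNN's prediction for a target node flips) can be illustrated with a small greedy sketch. Everything below is hypothetical: `toy_gnn_score` is a stand-in mean-aggregation model, not the black-box GNN from the paper, and the greedy loop only mimics the idea of scoring each single-edge perturbation's marginal effect — it is not InduCE's learned policy.

```python
import itertools
import numpy as np

def toy_gnn_score(adj, feats):
    """Stand-in 2-layer mean-aggregation "GNN" (illustrative only):
    returns a per-node score; predicted class = 1 if score > 0, else 0."""
    h = feats
    for _ in range(2):
        deg = adj.sum(axis=1, keepdims=True) + 1.0  # +1 for the self-loop
        h = (adj @ h + h) / deg                     # mean over neighbors + self
    return h[:, 0] - h[:, 1]

def greedy_counterfactual(adj, feats, target, score_fn, max_edits=5):
    """Greedily flip (add OR delete) the single edge whose marginal effect
    most pushes the target node's score toward the decision boundary.
    Returns the list of edge flips once the prediction changes, else None."""
    adj = adj.copy()
    orig_class = score_fn(adj, feats)[target] > 0
    sign = 1.0 if orig_class else -1.0              # direction to minimize
    n = adj.shape[0]
    edits = []
    for _ in range(max_edits):
        best_flip = None
        best_score = sign * score_fn(adj, feats)[target]
        for i, j in itertools.combinations(range(n), 2):
            adj[i, j] = adj[j, i] = 1 - adj[i, j]   # tentatively flip edge (i, j)
            s = sign * score_fn(adj, feats)[target]
            adj[i, j] = adj[j, i] = 1 - adj[i, j]   # undo the flip
            if s < best_score:                      # strict marginal improvement
                best_flip, best_score = (i, j), s
        if best_flip is None:
            break                                    # no single flip helps
        i, j = best_flip
        adj[i, j] = adj[j, i] = 1 - adj[i, j]        # commit the best flip
        edits.append(best_flip)
        if (score_fn(adj, feats)[target] > 0) != orig_class:
            return edits                             # prediction changed
    return None                                      # no counterfactual found

# Toy example: two "class 1" nodes (0, 1) and two "class 0" nodes (2, 3),
# with a single edge 0-1. Search for a counterfactual for node 0.
feats = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])
adj = np.zeros((4, 4))
adj[0, 1] = adj[1, 0] = 1.0
edits = greedy_counterfactual(adj, feats, target=0, score_fn=toy_gnn_score)
print(edits)  # a short list of edge flips (here an addition and a deletion)
```

Note how allowing additions matters in this sketch: attaching node 0 to an opposite-class node is the first move the greedy search finds, which a deletion-only explainer could never propose. Scoring each flip's marginal effect on the current (already perturbed) graph is what distinguishes this loop from a zero-shot, one-pass ranking of edges.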
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url:
Changes Since Last Submission: Dear Action Editor, thank you for your time and effort in coordinating the review process of our submission. We are uploading a camera-ready version addressing the minor concerns summarized in your decision statement. A summary of the changes introduced in this minor revision is provided below.

* We have provided a concrete example of why modeling marginal impact is important in `Sec 1`. The example depicted in `Fig. 2` showcases how zero-shot predictions may lead to inflated counterfactual sizes. We further note that Section 4.6 shows this intuition is not hypothetical: if InduCE is run to make zero-shot predictions, like existing algorithms, there is a significant drop in performance.
* We have added experiments to study the impact of **(1)** $\beta$, **(2)** the number of hops ($h$), and **(3)** the number of GAT layers on the performance of InduCE. The detailed analysis and experiments are presented in `Appendix K`, referenced from `Sec 4.2` in the main paper. The key observations that emerge are as follows:
  * $\beta$ has minimal impact on the performance of InduCE. This can be attributed to the generally small counterfactual sizes in the benchmark datasets. The actual effect of $\beta$ may therefore come into play for larger counterfactuals, when trajectories are longer, i.e., a larger set of edits is needed to change the class label of the target node.
  * The number of hops and the number of GAT layers follow a similar trend: performance is best when they match the number of layers in the black-box GNN being explained, and drops when they deviate from it. This drop is more pronounced in the inductive version, since the explainer needs to generalize to unseen graphs.
Assigned Action Editor: ~Frederic_Sala1
Submission Number: 1810