Graph Inverse Style Transfer for Counterfactual Explainability

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: Graph Inverse Style Transfer (GIST) pioneers the first backtracking framework for graph counterfactuals, combining spectral style transfer with backward refinement to achieve realism and validity in the produced counterfactuals.
Abstract: Counterfactual explainability seeks to uncover model decisions by identifying minimal changes to the input that alter the predicted outcome. This task becomes particularly challenging for graph data due to the need to preserve structural integrity and semantic meaning. Unlike prior approaches that rely on forward perturbation mechanisms, we introduce Graph Inverse Style Transfer (GIST), the first framework to re-imagine graph counterfactual generation as a backtracking process, leveraging spectral style transfer. By aligning the global structure with the original input spectrum and preserving local content faithfulness, GIST produces valid counterfactuals as interpolations between the input style and counterfactual content. Tested on 8 binary and multi-class graph classification benchmarks, GIST achieves a remarkable +7.6% improvement in the validity of produced counterfactuals and significant gains (+45.5%) in faithfully explaining the true class distribution. Additionally, GIST's backtracking mechanism effectively mitigates overshooting the underlying predictor's decision boundary, minimizing the spectral differences between the input and the counterfactuals. These results challenge traditional forward perturbation methods, offering a novel perspective that advances graph explainability.
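To make the spectral style-transfer idea concrete, the following is a minimal illustrative sketch (not the authors' implementation): the "style" of the input graph is taken to be the eigenvalue spectrum of its Laplacian, the "content" of a counterfactual candidate its eigenvectors, and a parameter `alpha` interpolates between the two spectra. The function names, the unnormalized Laplacian, and the thresholding step are all assumptions made for illustration.

```python
# Hypothetical sketch of spectral style transfer between an input graph (style)
# and a counterfactual candidate (content). Assumes both graphs share the same
# node set so their adjacency matrices have identical shape.
import numpy as np

def laplacian(adj: np.ndarray) -> np.ndarray:
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def spectral_style_transfer(adj_input: np.ndarray,
                            adj_counterfactual: np.ndarray,
                            alpha: float = 0.5) -> np.ndarray:
    """Blend the input graph's spectrum (style) into the counterfactual's
    eigenbasis (content) and rebuild a binary adjacency matrix."""
    # Eigendecomposition of both Laplacians (symmetric, so eigh is appropriate).
    style_vals, _ = np.linalg.eigh(laplacian(adj_input))
    cf_vals, cf_vecs = np.linalg.eigh(laplacian(adj_counterfactual))
    # Interpolated spectrum: closer to the input's style as alpha -> 1.
    blended_vals = alpha * style_vals + (1.0 - alpha) * cf_vals
    # Reassemble a Laplacian in the counterfactual's eigenbasis (content).
    L_new = cf_vecs @ np.diag(blended_vals) @ cf_vecs.T
    # Recover an adjacency matrix from the off-diagonal of -L and binarize it.
    adj_new = -(L_new - np.diag(np.diag(L_new)))
    return (adj_new > 0.5).astype(int)
```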
Lay Summary: Modern AI systems, like those used in medical diagnoses or fraud detection, often rely on complex graph structures to make predictions. However, these systems can feel like black boxes - it’s hard to understand why they make one decision over another. To make AI more transparent, researchers use counterfactual explanations: they answer questions like, *“What small change would have led the model to make a different decision?”* Explaining decisions on graphs is uniquely challenging because graphs are more than just data points - they have structure and relationships that need to be preserved. Traditional approaches try to “nudge” the graph forward into a new decision outcome, but this can break its structure or lose important information. Our research introduces GIST (Graph Inverse Style Transfer), a new way to explain AI decisions on graphs by thinking in *reverse*. Instead of pushing a graph forward until the AI flips its decision, we first overshoot and then carefully step backward to find a meaningful alternative. Inspired by how style transfer works in image editing, GIST uses the "style" of a graph - its overall structure - and combines it with local details to produce realistic and faithful explanations. GIST outperforms previous methods by a large margin, generating explanations that are not only valid (they change the AI’s decision) but also make sense structurally and contextually. This backtracking strategy is especially useful when the AI model is hard to access or can’t be queried frequently - a common real-world constraint. By making graph-based AI more understandable, GIST helps bring trustworthy and interpretable machine learning closer to sensitive applications like healthcare, finance, and law.
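The overshoot-then-backtrack strategy described above can be sketched as a simple loop, again purely as an assumption-laden illustration: starting from an overshot candidate on the far side of the decision boundary, step back toward the input's style and keep the last candidate whose predicted class still differs. Here `predict` is a hypothetical black-box classifier and `spectral_style_transfer` is the blend sketched earlier.

```python
# Hedged sketch of the backtracking refinement, not the authors' algorithm.
def backtrack_counterfactual(adj_input, adj_overshoot, predict, steps=20):
    """Step back from an overshot counterfactual toward the input's style,
    returning the last candidate that still flips the prediction."""
    original_class = predict(adj_input)
    best = adj_overshoot
    for i in range(1, steps + 1):
        alpha = i / steps  # larger alpha -> candidate closer to the input
        candidate = spectral_style_transfer(adj_input, adj_overshoot, alpha)
        if predict(candidate) == original_class:
            break          # crossed back over the decision boundary: stop
        best = candidate   # still a valid counterfactual, closer to the input
    return best
```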
Link To Code: https://github.com/bardhprenkaj/gist
Primary Area: Social Aspects->Accountability, Transparency, and Interpretability
Keywords: Counterfactual Explanations, Style Transfer, Backtracking, Graph Structures, Explainable AI
Submission Number: 6999