On the Impact of Sparsification on Quantitative Argumentative Explanations in Neural Networks

Published: 25 Oct 2025, Last Modified: 07 Apr 2026 | ArgXAI 2025: 3rd International Workshop on Argumentation for eXplainable AI | Everyone | CC BY 4.0
Abstract: Neural Networks (NNs) are powerful decision-making tools, but their lack of explainability limits their use in high-stakes domains such as healthcare and criminal justice. The recent SpArX framework sparsifies NNs and maps them to (weighted) Quantitative Bipolar Argumentation Frameworks (QBAFs) to provide an argumentative understanding of their mechanics. QBAFs can be explained by various quantitative argumentative explanation methods, such as Argument Attribution Explanations (AAEs), Relation Attribution Explanations (RAEs), and Contestability Explanations (CEs), which assign numerical scores to arguments or relations to quantify their influence on the dialectical strength of an argument to be explained. However, how sparsification of NNs impacts the explanations derived from the corresponding (weighted) QBAFs remains unexplored. In this paper, we explore two directions of impact. First, we empirically investigate how varying the sparsification level of NNs affects the preservation of these explanations: using four datasets (Iris, Diabetes, Cancer, and COMPAS), we find that AAEs are generally well preserved, whereas RAEs are not. Second, for CEs, we find that sparsification can improve computational efficiency in several cases. Overall, this study offers a preliminary investigation into the potential synergy between sparsification and explanation methods, opening up new avenues for future research.