Keywords: explainable AI, interpretability, CNN, layer-wise analysis, representation learning, hidden-layer attribution
TL;DR: We introduce a new way to interpret neural networks by decomposing hidden layers into coherent functional units using complexity analysis, offering deep descriptive insights beyond standard attribution methods.
Abstract: The growing deployment of artificial intelligence (AI) systems in safety-critical domains has underscored the need for transparent and trustworthy models. Existing explainability methods primarily focus on end-to-end interpretations and often fall short of revealing the internal processing dynamics of deep networks. In this paper, we introduce CRISP, a novel approach that decomposes neural networks into interpretable subprocesses, enabling a layer-wise analysis of hidden representations. Our method constructs interactive, low-complexity representations of the input-output transformations within hidden layers, facilitating a deeper understanding of network behavior beyond final predictions. We present a framework and empirical validation for Convolutional Neural Networks (CNNs), demonstrating the method’s potential to support fine-grained, process-level insights into model operation.
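For concreteness, here is a minimal sketch, assuming a PyTorch CNN, of the kind of layer-wise hidden-representation analysis the abstract describes: forward hooks capture each convolutional layer's output, and a low-rank projection yields a low-complexity per-layer summary. The toy model, hook helper, and PCA step are illustrative assumptions, not the CRISP decomposition itself.

```python
# Minimal sketch (illustrative only, not the CRISP method): capture hidden-layer
# activations of a CNN with forward hooks, then compress each layer's
# activations into a low-rank, low-complexity summary.
import torch
import torch.nn as nn

model = nn.Sequential(  # toy CNN standing in for the analyzed network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)

activations = {}  # layer name -> captured activation tensor

def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Register a hook on every convolutional layer.
for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        module.register_forward_hook(make_hook(name))

x = torch.randn(8, 3, 32, 32)  # small batch of dummy inputs
model(x)

# Summarize each layer: flatten spatial dimensions and keep the top
# principal components as a low-complexity view of the representation.
for name, act in activations.items():
    flat = act.flatten(start_dim=1)         # (batch, channels * H * W)
    _, s, _ = torch.pca_lowrank(flat, q=4)  # centers internally by default
    print(f"{name}: shape {tuple(act.shape)}, top singular values "
          f"{[round(v, 2) for v in s.tolist()]}")
```

Per-layer summaries of this sort are the raw material for the process-level analysis described above; the actual grouping of units into interpretable subprocesses is the subject of the paper.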
Primary Area: interpretability and explainable AI
Submission Number: 10907