Keywords: Explainable AI, Multi-Neuron Explanations, Model Interpretation
TL;DR: Neurobasket organizes neurons into semantic baskets, enabling structured and interpretable multi-neuron explanations.
Abstract: Deep neural networks excel across domains, yet their internal representations remain opaque. Prior approaches based on single neurons or non-hierarchical groups are limited by the distributed nature of concept encoding. We introduce Neurobasket, a framework that constructs semantically coherent multi-neuron groups through hierarchical clustering and natural language grounding. Neurobaskets enable set-theoretic reasoning, with unions revealing shared abstractions and differences highlighting discriminative cues. Experiments across convolutional and transformer models trained on diverse datasets show that neurobaskets yield stable, semantically aligned neuron sets while capturing prediction-relevant pathways. Qualitative visualizations further show that grouped neurons correspond to coherent, localized concepts. Overall, Neurobasket provides a structured and compositional view of neural representations, extending beyond unit-centric or non-hierarchical explanations.
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 10664
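The abstract above describes two technical ingredients: hierarchical clustering of neurons into "baskets" and set-theoretic reasoning over those baskets (unions for shared abstractions, differences for discriminative cues). The sketch below is a minimal, illustrative reading of that idea, not the paper's actual pipeline: the synthetic `activations` matrix, the correlation distance, the 0.7 dendrogram cut, and the omission of the natural-language grounding step are all assumptions made here for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Synthetic stand-in for neuron activation profiles:
# rows = neurons, columns = responses over a probe set of inputs (assumed setup).
rng = np.random.default_rng(0)
activations = rng.normal(size=(64, 200))  # 64 neurons, 200 probe inputs

# Hierarchical (agglomerative) clustering of neurons by activation similarity.
# Correlation distance groups neurons that co-activate across the probe set.
dist = pdist(activations, metric="correlation")
tree = linkage(dist, method="average")

# Cut the dendrogram into flat clusters ("baskets"); the threshold is a
# free parameter chosen here for illustration, not a value from the paper.
labels = fcluster(tree, t=0.7, criterion="distance")
baskets = {k: set(np.flatnonzero(labels == k)) for k in np.unique(labels)}

# Set-theoretic reasoning over baskets, in the spirit of the abstract:
# a union pools the neurons behind two concepts (a shared abstraction),
# a difference isolates neurons specific to one of them (discriminative cues).
ids = sorted(baskets)
a, b = baskets[ids[0]], baskets[ids[1]]
union = a | b           # combined multi-neuron group
discriminative = a - b  # neurons that separate basket a from basket b

print(f"basket sizes: {len(a)}, {len(b)}")
print(f"union: {len(union)}, difference: {len(discriminative)}")
```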