Keywords: Deep Learning, Explainable Artificial Intelligence, Computer Vision
TL;DR: Some concepts are absent from the input yet still influence the output, and current XAI methods cannot account for such encoded absences.
Abstract: Explainable artificial intelligence (XAI) aims to provide human-interpretable insights into the behavior of deep neural networks (DNNs), typically by estimating a simplified causal structure of the model. In existing work, this causal structure most often includes relationships where the presence of an input pattern or latent feature is associated with a strong activation of a neuron. For example, attribution methods identify input pixels that contribute most to a prediction, and feature visualization methods reveal inputs that cause high activation of a target neuron — both implicitly assuming that neurons encode the presence of concepts. However, a largely overlooked type of causal relationship is that of *encoded absences*, where the absence of a concept increases activation or, conversely, the presence of a concept inhibits activation. In this work, we show that such inhibitory relationships are common and that standard XAI methods fail to reveal them. To address this, we propose two extensions to attribution and feature visualization techniques that uncover encoded absences. Across experiments, we show that standard XAI methods fail to explain encoded absences, illustrate how these absences can be revealed, demonstrate how ImageNet models exploit them, and show that debiasing improves when they are taken into account.
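To make the notion of inhibitory evidence concrete, below is a minimal sketch (not the paper's proposed extension) of how one might keep the sign of a gradient×input attribution map and separate contributions that support a class from those that inhibit it; the pretrained model, class index, and random input are placeholders chosen purely for illustration.

```python
# Minimal sketch, assuming a standard PyTorch/torchvision setup: signed
# gradient x input attribution, split into presence-supporting (positive)
# and inhibitory (negative) pixel contributions for one target logit.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

x = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder input image
target_class = 207  # hypothetical target class index

logit = model(x)[0, target_class]
logit.backward()

attribution = (x.grad * x).sum(dim=1)            # signed map, summed over channels
positive_evidence = attribution.clamp(min=0)      # pixels whose presence supports the class
negative_evidence = (-attribution).clamp(min=0)   # pixels that inhibit the class
```

Standard saliency visualizations often discard or absolute-value the negative part; inspecting `negative_evidence` separately is one simple way to surface candidate inhibitory relationships of the kind the abstract describes.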
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 17180