Keywords: Loss Landscape, Interpretability, Kernel, Influence Functions, Singular Learning Theory, Data Attribution, Geometry
TL;DR: The Loss Kernel measures functional similarity between inputs via loss covariance under SGLD-sampled weight perturbations, revealing semantic structure in ImageNet aligned with the WordNet hierarchy.
Abstract: We introduce the loss kernel, an interpretability method for measuring similarity between data points according to a trained neural network. The kernel is the covariance matrix of per-sample losses computed under a distribution of parameter perturbations that preserve low loss. We first validate our method on a synthetic multitask problem, showing it separates inputs by task as predicted by theory. We then apply this kernel to Inception-v1 to visualize the structure of ImageNet, and we show that the kernel's structure aligns with the WordNet semantic hierarchy. This establishes the loss kernel as a practical tool for interpretability and data attribution.
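The abstract and TL;DR together imply a concrete recipe: sample weights near the trained solution with SGLD, record each data point's loss under every draw, and take the covariance across draws. Below is a minimal PyTorch sketch of that recipe, not the paper's implementation: the function name `loss_kernel`, the full-batch localized SGLD chain, and all hyperparameters (step size, localization strength, chain length) are illustrative assumptions.

```python
# Minimal sketch of a loss kernel, assuming a PyTorch model and a dataset
# small enough to process in one batch. loss_fn is assumed to use
# reduction='none' so it returns one loss per sample.
import torch

def loss_kernel(model, xs, ys, loss_fn, n_draws=100, n_burnin=50,
                step_size=1e-5, localization=100.0):
    """Covariance of per-sample losses over SGLD-sampled weight perturbations.

    Returns an (N, N) matrix whose (i, j) entry is the covariance between
    the losses of samples i and j across the retained draws.
    """
    theta0 = [p.detach().clone() for p in model.parameters()]  # anchor weights
    per_sample_losses = []  # one row of N losses per retained draw

    for t in range(n_burnin + n_draws):
        # One SGLD step, localized around the trained weights so the chain
        # stays in a low-loss neighbourhood (illustrative choice).
        model.zero_grad()
        loss_fn(model(xs), ys).mean().backward()
        with torch.no_grad():
            for p, p0 in zip(model.parameters(), theta0):
                drift = p.grad + localization * (p - p0)
                noise = torch.randn_like(p) * (step_size ** 0.5)
                p.add_(-0.5 * step_size * drift + noise)
            if t >= n_burnin:
                per_sample_losses.append(loss_fn(model(xs), ys))

    with torch.no_grad():  # restore the trained weights before returning
        for p, p0 in zip(model.parameters(), theta0):
            p.copy_(p0)

    L = torch.stack(per_sample_losses)       # (n_draws, N)
    L = L - L.mean(dim=0, keepdim=True)      # center each sample's losses
    return L.T @ L / (L.shape[0] - 1)        # (N, N) loss kernel
```

With, say, `torch.nn.CrossEntropyLoss(reduction='none')` as `loss_fn`, the resulting N-by-N matrix can be passed to standard clustering or spectral-embedding tools, which is the kind of analysis the abstract describes for visualizing ImageNet structure against the WordNet hierarchy.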
Primary Area: interpretability and explainable AI
Submission Number: 20180