Keywords: representational alignment, representational similarity, explainability, interpretability, comparison, clustering
TL;DR: A method to visually and interpretably compare two neural networks.
Abstract: Current computer vision models are approaching superhuman performance on visual categorization tasks in domains such as ecology and radiology.
Explainable AI (XAI) methods aim to explain how such models make decisions. Unfortunately, to produce human-friendly explanations, XAI methods often simplify model behavior to the point that critical information is lost. For humans to learn how models achieve superhuman performance, we must work towards understanding these nuances. In this work, we consider the challenging task of visually explaining the differences between two representations. By its nature, this task forces XAI methods to discard coarse-grained, obvious aspects of a model's representation and focus on the nuances that make a model unique. To this end, we propose a clustering method that isolates neighborhoods of images that are close together in one representation but distant in the other. These discovered clusters represent concepts that are present in only one of the two representations.
We use our method to compare different model representations and discover semantically meaningful clusters.
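The core idea of the abstract, finding image neighborhoods that are tight in one representation but scattered in the other, can be illustrated with a minimal sketch. This is not the paper's actual method; the function name `divergent_neighborhoods`, the k-NN neighborhood construction, the median-distance normalization, and the mean-distance contrast score are all assumptions made for illustration.

```python
# Illustrative sketch (not the authors' method): find neighborhoods of images
# that are close together in representation A but spread out in representation B.
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.neighbors import NearestNeighbors

def divergent_neighborhoods(emb_a, emb_b, k=10, top_m=5):
    """Return image-index sets for k-NN neighborhoods (built in emb_a) whose
    members are, on average, far apart in emb_b relative to emb_a.

    emb_a, emb_b: (n_images, d_a) and (n_images, d_b) embeddings of the same
    images from the two models being compared (hypothetical inputs).
    """
    # Neighborhoods are defined by proximity in representation A.
    nn = NearestNeighbors(n_neighbors=k).fit(emb_a)
    _, neighbors = nn.kneighbors(emb_a)  # (n, k) indices, including the anchor image

    # Normalize each space by its median pairwise distance so the two
    # representations are roughly comparable in scale (an assumed heuristic).
    dist_a = pairwise_distances(emb_a)
    dist_b = pairwise_distances(emb_b)
    dist_a /= np.median(dist_a)
    dist_b /= np.median(dist_b)

    scores = []
    for idx in neighbors:
        sub_a = dist_a[np.ix_(idx, idx)]  # within-neighborhood distances in A
        sub_b = dist_b[np.ix_(idx, idx)]  # same image pairs, measured in B
        # High score: tight cluster in A, scattered in B.
        scores.append(sub_b.mean() - sub_a.mean())

    top = np.argsort(scores)[::-1][:top_m]
    return [neighbors[i] for i in top]  # each entry: image indices of one cluster
```

In such a sketch, each returned index set is a candidate "concept" present in representation A but not in B; inspecting the corresponding images is what would make the comparison visual and interpretable.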
Submission Type: Short Paper (4 Pages)
Archival Option: This is a non-archival submission
Presentation Venue Preference: ICLR 2025
Submission Number: 26