Exploring Pointwise Similarity of Representations

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: deep learning, representation learning, representation similarity, interpretability
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We show that many intriguing phenomena in deep learning can be understood by measuring the similarity of representations at the level of individual data points.
Abstract: Representation similarity measures have emerged as a popular tool for examining learned representations. Many existing studies have focused on analyzing aggregate estimates of similarity at a global level, i.e. over a set of representations for N input examples. In this work, we shed light on the importance of investigating the similarity of representations at a local level, i.e. the representations of a single input example. We show that peering through the lens of similarity of individual data points can reveal previously overlooked phenomena in deep learning. Specifically, we investigate the similarity of learned representations of inputs produced by architecturally identical models that differ only in random initialization. We find that while standard models represent (most) inputs similarly only when they are drawn from the training data distribution, adversarially trained models represent a wide variety of out-of-distribution inputs similarly, indicating that these models learn more "stable" representations. We design an instantiation of such a pointwise measure, named Pointwise Normalized Kernel Alignment (PNKA), that provides a way to quantify the similarity of an individual point across distinct representation spaces. Using PNKA, we additionally show how we can further understand the effects of data (e.g. corruptions) and model (e.g. fairness constraints) interventions on a model's representations.
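The abstract does not spell out how the pointwise score is computed; the sketch below is a hypothetical illustration of a pointwise, kernel-alignment-style measure in the spirit described (one score per input, comparing how that input relates to the rest of the data in two representation spaces). The linear kernel, centering, and row-wise cosine normalization are assumptions, not the paper's exact PNKA definition.

```python
# Hypothetical sketch of a pointwise representation-similarity score.
# Assumption: a linear kernel plus row-wise cosine similarity; the exact
# normalization used by PNKA in the paper may differ.
import numpy as np

def pointwise_similarity(A, B):
    """Per-example similarity between two representation spaces.

    A, B: (N, d_A) and (N, d_B) arrays holding representations of the same
    N inputs, e.g. from two architecturally identical models trained from
    different random initializations. Returns a length-N vector of scores.
    """
    # Center each feature dimension so constant offsets do not dominate.
    A = A - A.mean(axis=0, keepdims=True)
    B = B - B.mean(axis=0, keepdims=True)

    # Linear-kernel similarity of each point to every other point.
    K = A @ A.T  # (N, N)
    L = B @ B.T  # (N, N)

    # Cosine similarity between row i of K and row i of L: how similarly
    # point i relates to the rest of the data in the two spaces.
    num = (K * L).sum(axis=1)
    denom = np.linalg.norm(K, axis=1) * np.linalg.norm(L, axis=1)
    return num / np.clip(denom, 1e-12, None)

# Usage example with random stand-in representations of 5 inputs.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 8))
B = A @ rng.normal(size=(8, 8)) * 0.1 + rng.normal(size=(5, 8))
print(pointwise_similarity(A, B))  # one score per input example
```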
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7209