Keywords: deep neural network, weights similarity, model interpretation, computer vision
TL;DR: We propose a new model similarity metric that overcomes the calibration weaknesses of current measures and provides higher-quality predictions of functional similarity.
Abstract: Deep learning approaches have revolutionized artificial intelligence, but model opacity and fragility remain significant challenges. We believe these challenges stem from a knowledge gap at the heart of the field: the lack of well-calibrated metrics quantifying the similarity of the internal representations of models obtained with different architectures, training strategies, checkpoints, or random initializations. While several metrics have been proposed, they are poorly calibrated, susceptible to manipulation and confounding factors, and computationally intensive when probed with a large and diverse set of test samples. We report an integration of chain normalization of weights with centered kernel alignment that, by measuring weight similarity instead of activation similarity, overcomes most of the limitations of existing metrics. Our approach is sample-agnostic, symmetric in weight space, computationally efficient, and well-calibrated.
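For readers unfamiliar with the building block named in the abstract, the sketch below shows linear centered kernel alignment (CKA, Kornblith et al., 2019) applied to weight matrices rather than activations. This is illustrative context only, not the authors' method: the paper's chain-normalization preprocessing is not reproduced here, and the function name `linear_cka` and the NumPy setup are assumptions for the example.

```python
import numpy as np

def linear_cka(W1: np.ndarray, W2: np.ndarray) -> float:
    """Linear CKA between two weight matrices.

    W1: (n, p) and W2: (n, q), where rows index a shared dimension
    (e.g., per-neuron weight vectors flattened to a common input size).
    Returns a similarity in [0, 1], invariant to orthogonal transforms
    and isotropic scaling of either matrix.
    """
    # Center each column; CKA is defined on centered features.
    X = W1 - W1.mean(axis=0, keepdims=True)
    Y = W2 - W2.mean(axis=0, keepdims=True)
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))

# Toy check: a weight matrix compared against a lightly perturbed copy
# should score near 1, while independent random weights score near 0.
rng = np.random.default_rng(0)
W_a = rng.standard_normal((512, 128))
W_b = W_a + 0.1 * rng.standard_normal((512, 128))
print(linear_cka(W_a, W_b))                               # close to 1
print(linear_cka(W_a, rng.standard_normal((512, 128))))   # close to 0
```

Because the inputs here are weights rather than activations, no probe samples are needed, which is the sample-agnostic property the abstract claims.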
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8223