Causes and Consequences of Representational Similarity in Machine Learning Models

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: representational similarity, interpretable machine learning
TL;DR: We perform the first empirical evaluation of causes of representational similarity in machine learning models, exploring the relative effects of task and dataset overlap and the downstream consequences of representational similarity.
Abstract: Numerous works have noted similarities in how machine learning models represent the world, even across modalities. Although much effort has been devoted to uncovering the properties and metrics along which these models align, surprisingly little work has explored the causes of this similarity. To advance this line of inquiry, we study how two factors—dataset overlap and task overlap—influence downstream model similarity. We evaluate the effects of both factors through experiments across model sizes and modalities, from small classifiers to large language models. We find that, in general, both task overlap and dataset overlap increase representational similarity. Finally, we consider downstream consequences of representational similarity, demonstrating how greater similarity increases vulnerability to transferable adversarial attacks.
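The abstract does not specify which similarity metric the experiments use, but a standard choice in this literature is linear Centered Kernel Alignment (CKA). The sketch below is illustrative only (not the paper's implementation): it compares two representation matrices, each of shape (n_samples, n_features), and returns a score in [0, 1].

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two representation matrices with the same
    number of rows (samples); feature dimensions may differ."""
    # Center each feature dimension across samples.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = (np.linalg.norm(X.T @ X, ord="fro")
                   * np.linalg.norm(Y.T @ Y, ord="fro"))
    return float(numerator / denominator)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))   # e.g. activations from model A
Y = rng.standard_normal((100, 64))   # e.g. activations from model B
print(linear_cka(X, X))  # identical representations -> 1.0
print(linear_cka(X, Y))  # unrelated random features -> near 0
```

CKA is invariant to orthogonal transformations and isotropic scaling of the representations, which makes it a common tool for comparing models with different architectures or feature dimensionalities.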
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 13149