Track: Extended Abstract Track
Keywords: Representational geometry, Out-of-distribution generalization, Image classification
TL;DR: Representational geometric signatures from in-distribution data consistently predict failure in out-of-distribution generalization
Abstract: Generalization, the ability to perform well outside the training context, is a hallmark of both biological and artificial intelligence. A key challenge is to anticipate potential failure modes at evaluation time using only the information available at training time. In this work, we study image classification tasks where the image classes differ between training and evaluation, and ask whether failure in such out-of-distribution (OOD) generalization can be predicted by analyzing the representations (i.e., feature vectors) of in-distribution (ID) training data. Across architectures, network sizes, training parameters, optimization algorithms, and datasets, we find that conventional metrics fail to robustly predict OOD generalization, while task-relevant geometric signatures of ID representations strongly correlate with it. Specifically, networks tend to generalize poorly to new image classes when the ID object manifolds are more compressed (i.e., lower-dimensional) in the feature space. Our results highlight representational geometry as a promising lens for mechanistic interpretability and robustness, with potential implications for comparing biological and artificial neural systems.
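The abstract does not specify how manifold dimensionality is measured; a common choice in the representational-geometry literature is the participation ratio of the per-class feature covariance spectrum. The sketch below is a hypothetical illustration of that measure, not the paper's actual pipeline: a class whose features spread across many directions yields a high participation ratio, while a compressed (effectively low-dimensional) class yields a value near 1.

```python
import numpy as np

def participation_ratio(features: np.ndarray) -> float:
    """Effective dimensionality of a point cloud of feature vectors.

    features: (n_samples, n_dims) representations of one class.
    Returns (sum of eigenvalues)^2 / (sum of squared eigenvalues)
    of the class covariance matrix; ranges from 1 (one dominant
    direction) to n_dims (isotropic spread).
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / len(features)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # guard tiny negatives
    return float(eig.sum() ** 2 / (eig ** 2).sum())

# Toy comparison: an isotropic cloud vs. one squeezed onto one axis.
rng = np.random.default_rng(0)
isotropic = rng.normal(size=(500, 10))
compressed = rng.normal(size=(500, 10)) * np.array([5.0] + [0.1] * 9)

print(participation_ratio(isotropic))   # close to 10
print(participation_ratio(compressed))  # close to 1
```

Under the paper's finding, lower per-class participation ratios on ID data would be the signature associated with poorer OOD generalization.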
Submission Number: 16