Track: Extended Abstract Track
Keywords: Representational alignment, NeuroAI, Model-Brain Alignment, Model-to-Model Alignment, Alignment Perturbation
TL;DR: Training networks to misalign with oracle models, other reference networks, or brain responses has a performance cost.
Abstract: Several recent results suggest that brain-like computations emerge in Deep Neural Networks (DNNs) trained on naturalistic stimuli, leading to the hypothesis that shared computations between DNNs and brains arise because these representations are necessary for optimal performance. However, existing studies primarily demonstrate correlations between alignment and performance rather than establish causality. We address this gap by proposing a representational perturbation framework that actively promotes or suppresses alignment with reference representations during training while maintaining task optimization. This allows us to test whether representational alignment is necessary for optimal performance or merely coincidental. We train over 60 large-scale vision models under varying alignment constraints, constructing Pareto-optimal curves that quantify the trade-off between representational alignment and task performance. Our results consistently show that models trained to minimize alignment with oracle theoretical models, pretrained networks, or brain responses achieve worse task performance than those trained to maximize alignment, providing the first causal evidence that representational alignment is functionally important rather than epiphenomenal.
Submission Number: 137
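
The abstract does not specify how the alignment constraint enters training, so the following is a minimal sketch of one plausible instantiation: a signed alignment term added to the task loss, with linear CKA as an assumed alignment measure. All names here (`linear_cka`, `perturbed_loss`, `alignment_sign`, `alignment_strength`) are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def linear_cka(X: torch.Tensor, Y: torch.Tensor) -> torch.Tensor:
    """Linear Centered Kernel Alignment between [batch, features] matrices.

    Assumed alignment metric; the paper may use a different measure
    (e.g., regression-based predictivity or RSA).
    """
    X = X - X.mean(dim=0, keepdim=True)  # center each feature dimension
    Y = Y - Y.mean(dim=0, keepdim=True)
    hsic = (Y.T @ X).norm() ** 2          # ||Y^T X||_F^2
    return hsic / ((X.T @ X).norm() * (Y.T @ Y).norm())

def perturbed_loss(
    logits: torch.Tensor,
    targets: torch.Tensor,
    model_repr: torch.Tensor,
    reference_repr: torch.Tensor,
    alignment_sign: float,      # +1 promotes alignment, -1 suppresses it
    alignment_strength: float,  # trade-off weight (hypothetical hyperparameter)
) -> torch.Tensor:
    """Task loss plus a signed alignment term.

    The task term keeps optimizing performance while the alignment term
    pushes the model's representation toward (+1) or away from (-1) the
    reference representation (oracle model, pretrained network, or brain data).
    """
    task_loss = F.cross_entropy(logits, targets)
    alignment = linear_cka(model_repr, reference_repr)
    return task_loss - alignment_sign * alignment_strength * alignment

# Example: suppress alignment with a frozen reference network's features.
# loss = perturbed_loss(logits, targets, feats, ref_feats.detach(),
#                       alignment_sign=-1.0, alignment_strength=0.1)
```

Sweeping `alignment_strength` in both sign directions would trace out the Pareto curves between alignment and task performance that the abstract describes.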