Keywords: brain alignment, benchmarking, representational similarity analysis, video models
TL;DR: We expose the limits of brain-alignment benchmarks for SOTA video models and propose a framework based on cross-region alignment patterns in the brain for a more robust and meaningful assessment of brain–model alignment.
Abstract: Neuroscientists and computer vision researchers use model–brain alignment benchmarks to compare artificial and biological vision systems. These benchmarks rank models according to alignment measures such as the similarity of representational geometry or the predictivity of neural responses from model activations. However, recent work has raised a number of problems with these rankings, most critically their lack of discriminative power, raising the conceptual question of what it means for a model to be "brain-aligned".
Here we introduce *alignment patterns*, the characteristic profile of functional relationships between each brain region and all others, and propose that models should reproduce these patterns to qualify as brain-aligned.
First, we apply a standard benchmarking pipeline to a broad spectrum of vision models on the BOLD Moments video fMRI dataset across visual regions of interest (ROIs).
We find that diverse models appear *equivalent* in their brain alignment, reflecting the lack of discriminative power of conventional alignment benchmarks.
Conventional alignment evaluation is a pointwise similarity test: it assesses whether a model is aligned to an individual ROI. It is therefore sensitive to the specific invariances and scaling properties of the chosen metric. In contrast, *alignment pattern analysis (APA)* is a second-order *structural consistency* test: a model aligned to a given ROI should reproduce that ROI’s characteristic cross-region alignment profile.
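The distinction between a pointwise similarity test and the second-order APA test can be illustrated with a minimal sketch. The code below is illustrative only, not the authors' implementation: ROI names, feature dimensions, and the choice of plain Pearson correlation over RDM upper triangles are all assumptions, and the data is random.

```python
# Hypothetical sketch of pointwise alignment vs. alignment pattern analysis (APA).
# All names and parameters are illustrative assumptions, not the paper's pipeline.
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix: 1 - correlation between stimuli."""
    return 1.0 - np.corrcoef(responses)

def upper(m):
    """Vectorize the upper triangle of an RDM (excluding the diagonal)."""
    return m[np.triu_indices_from(m, k=1)]

def alignment(a, b):
    """Pointwise alignment: correlation of RDM upper triangles
    (plain Pearson here for simplicity)."""
    return np.corrcoef(upper(rdm(a)), upper(rdm(b)))[0, 1]

def alignment_pattern(target, others):
    """A region's (or model's) alignment profile to a set of other regions."""
    return np.array([alignment(target, o) for o in others])

# Synthetic stand-ins: ROI responses and model features over the same stimuli.
rng = np.random.default_rng(0)
n_stimuli = 50
rois = {name: rng.standard_normal((n_stimuli, 100))
        for name in ["V1", "V2", "V4", "LOC", "IT"]}
model_feats = rng.standard_normal((n_stimuli, 256))

# Conventional (pointwise) test: model vs. a single ROI.
pointwise = alignment(model_feats, rois["IT"])

# APA (second-order) test: does the model reproduce IT's cross-region profile?
others = [rois[r] for r in ["V1", "V2", "V4", "LOC"]]
brain_pattern = alignment_pattern(rois["IT"], others)
model_pattern = alignment_pattern(model_feats, others)
apa_score = np.corrcoef(brain_pattern, model_pattern)[0, 1]
```

A model can score well on `pointwise` while its `model_pattern` bears little resemblance to `brain_pattern`, which is the failure mode APA is designed to expose.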
Applying this test, we find that, while these patterns are highly stable across brains of different subjects, even top-ranked models often fail to capture them. Notably, models that appear effectively equivalent in alignment diverge sharply under the relational criterion, demonstrating the added discriminative value of APA.
Finally, we argue for a clearer distinction between the criteria a model must meet to serve as a tool versus as a computational model. Conventional alignment measures may be sufficient for identifying neurally predictive models, but claims about computational or algorithmic similarity may require a stronger basis of evidence, including the reproducibility of relational alignment patterns.
Primary Area: applications to neuroscience & cognitive science
Submission Number: 13969