Brain–Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage

Angela Lopez-Cardona; Sebastian Idesis; Mireia Masias Bruns; Sergi Abadal; Ioannis Arapakis

Published: 23 Sept 2025, Last Modified: 17 Nov 2025, Venue: UniReps 2025, License: CC BY 4.0

Track: Proceedings Track

Keywords: fMRI, brain alignment, intermediate layers, Platonic Hypothesis, LLMs, MLLMs

Abstract: Do brains and language models converge toward the same internal representations of the world? Recent years have seen a rise in studies of the alignment between neural activations and model representations. In this work, we review 25 fMRI-based studies published between 2023 and 2025 and explicitly confront their findings with two key hypotheses: (i) the Platonic Representation Hypothesis, that as models scale and improve, they converge to a representation of the real world; and (ii) the Intermediate-Layer Advantage, that intermediate (mid-depth) layers often encode richer, more generalizable features. Our findings provide converging evidence that models and brains may share abstract representational structures, supporting both hypotheses and motivating further research on brain–model alignment.

Submission Number: 30
