Brain–Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage
Track: Proceedings Track
Keywords: fMRI, brain alignment, intermediate layers, Platonic Hypothesis, LLMs, MLLMs
Abstract: Do brains and language models converge toward the same internal representations of the world? Recent years have seen a rise in studies that measure the alignment between neural activations and model representations. In this work, we review 25 fMRI-based studies published between 2023 and 2025 and evaluate their findings against two key hypotheses: (i) the Platonic Representation Hypothesis, which holds that as models scale and improve, they converge toward a shared representation of the real world, and (ii) the Intermediate-Layer Advantage, which holds that intermediate (mid-depth) layers often encode richer, more generalizable features. The reviewed studies provide converging evidence that models and brains may share abstract representational structures, supporting both hypotheses and motivating further research on brain–model alignment.
Submission Number: 30