Track: Extended Abstract Track
Keywords: representation, MLIP, representation alignment, foundation model
TL;DR: Scientific foundation models of different modalities, architectures, and training datasets are converging to a universal representation of matter.
Abstract: Scientific foundation models are rapidly emerging across physics, chemistry, and biology, yet it remains unclear whether they converge toward a shared representation of matter or remain bound to their domain and modality. We analyze embeddings from nearly 60 models spanning molecules, materials, and proteins, using two complementary alignment metrics to probe their learned representations. We find modest cross-modality alignment between molecule and materials models but strong alignment among protein models. We find that the training dataset, rather than the architecture, is the dominant factor shaping latent spaces. Nontrivial cross-modal alignment and strong within-modality alignment hint that models are converging toward a common optimum in representation space. However, models align more strongly out-of-distribution than in-distribution, suggesting they remain data-limited and fall short of true foundation status. Our framework establishes representation alignment as a dynamic criterion for evaluating foundation-level generality in scientific models. This is an abbreviated, work-in-progress submission of our full manuscript, which will be linked in the comments below shortly.
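The abstract does not name the two alignment metrics, but a common choice for comparing embeddings of the same inputs across models is linear Centered Kernel Alignment (CKA). The following is a minimal sketch of that metric, offered as an illustrative assumption rather than the authors' actual method:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two embedding matrices.

    X: (n_samples, d1) and Y: (n_samples, d2) hold embeddings of the
    same n_samples inputs produced by two different models; the score
    is 1.0 for identical (up to rotation/scale) representations and
    near 0 for unrelated ones.
    """
    # Center each representation along the sample axis.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # CKA(X, Y) = ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)
```

A metric like this is invariant to isotropic scaling and orthogonal transformations of either embedding space, which is what makes cross-architecture and cross-modality comparisons of the kind described above meaningful.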
Submission Number: 145