The Few-Shot Unreliability of Molecular Foundation Models: A Geometric Diagnosis and Partial Remedy

Kevin Tirta Wijaya; Vahid Babaei

Published: 30 May 2026, Last Modified: 30 May 2026

Venue: ICML2026-AI4Science Poster

License: CC BY 4.0

Track: Track 1: Original Research/Position/Education/Attention Track

Keywords: representation learning, molecular foundation model

Abstract: Molecular foundation models (MFMs) are widely proposed as a solution to label scarcity in molecular property prediction, on the premise that pretraining on millions of unlabeled molecules produces representations that transfer to new endpoints with minimal supervision. We present an empirical study testing this premise on three MFMs across nine OpenADMET regression endpoints, with training set size varying from $N=10$ to $N=1000$. We find that frozen MFM embeddings are uniformly worse than Morgan fingerprints at every training set size, with the gap widening as the training set grows. A lightweight partial least squares (PLS) projection recovers most of this gap at small training set sizes, revealing that ADMET-relevant signal exists in MFM embeddings but is hidden in directions that a regressor cannot discover from only a handful of labeled examples. Yet even with PLS, the deeper problem remains: at extreme label scarcity, no representation achieves an $R^2$ score above that of a naive train-mean predictor on any OpenADMET endpoint. Specifically, foundation model embeddings do not provide meaningful predictive signal above this baseline until $N \geq 100$, highlighting their unreliability in the label-scarce regimes where they are most needed.
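To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' pipeline, of a one-component PLS projection and of the train-mean baseline it is compared against. Several things here are assumptions: the abstract does not state the number of PLS components, the downstream regressor, or the exact $R^2$ definition, so the sketch uses a single component, a least-squares fit on the PLS score, and $R^2$ with the total sum of squares taken around the test-set mean. The function names and the toy arrays are hypothetical and stand in for real embeddings and endpoint labels.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_pls1(X, y):
    """Fit a one-component PLS regressor (assumed setup) and return a predict function."""
    n, d = len(X), len(X[0])
    x_mean = [fmean(row[j] for row in X) for j in range(d)]
    y_mean = fmean(y)
    # First PLS weight vector: the direction of maximal covariance with the labels,
    # i.e. the unit vector proportional to Xc^T yc (Xc, yc are the centered data).
    w = [sum((X[i][j] - x_mean[j]) * (y[i] - y_mean) for i in range(n))
         for j in range(d)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]

    def project(row):
        # Scalar PLS score of one (centered) sample along w.
        return sum((row[j] - x_mean[j]) * w[j] for j in range(d))

    # Least-squares slope of the centered labels on the one-dimensional score.
    t = [project(row) for row in X]
    slope = sum(ti * (yi - y_mean) for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)

    def predict(X_new):
        return [y_mean + slope * project(row) for row in X_new]

    return predict


# Toy data (hypothetical): the label depends only on the second feature,
# while the first feature is uncorrelated with the label in the training set.
X_train = [[1.0, 0.0], [-1.0, 1.0], [-1.0, 2.0], [1.0, 3.0]]
y_train = [0.0, 2.0, 4.0, 6.0]
X_test = [[5.0, 1.0], [-3.0, 4.0]]
y_test = [2.0, 8.0]

predict = fit_pls1(X_train, y_train)
pls_r2 = r2_score(y_test, predict(X_test))

# Naive baseline: predict the training-set mean for every test molecule.
baseline_r2 = r2_score(y_test, [fmean(y_train)] * len(y_test))
```

Under this definition of $R^2$, the train-mean predictor can never score above zero, because its residual sum of squares equals the total sum of squares plus a non-negative term that grows with the gap between the training and test means. A representation is therefore only useful at a given $N$ if its $R^2$ clears that baseline, which is the comparison the abstract reports.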

Submission Number: 220
