Keywords: Self-Supervised Learning, Molecule, Linear Probing, Fine-tuning, Pretrain Gain, Parameter Shift
TL;DR: We propose a unified evaluation framework for molecular SSL beyond fine-tuning
Abstract: Self-Supervised Learning (SSL) has shown great success in language and vision by using pretext tasks to learn representations without manual labels. Motivated by this, SSL has also emerged as a promising methodology in the molecular domain, which has unique challenges such as high sensitivity to subtle structural changes and scaffold splits, thereby requiring strong generalization ability. However, existing SSL-based approaches have been predominantly evaluated by naïve fine-tuning performance. For a more diagnostic analysis of generalizability beyond fine-tuning, we introduce a multi-perspective evaluation framework for molecular SSL under a unified experimental setting, varying only the pretraining strategies. We assess the quality of learned representations via linear probing on frozen encoders, measure Pretrain Gain by comparison against random initialization, quantify forgetting during fine-tuning, and explore scalability. Experimental results show that several models, surprisingly, exhibit low or even negative Pretrain Gain in linear probing. Graph neural network-based models experience substantial parameter shifts, and most models derive negligible benefits from larger pretraining datasets. Our reassessments offer new insights into the current landscape and challenges of molecular SSL.
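To make the evaluation protocol concrete, below is a minimal, hedged sketch of the two central measurements the abstract describes: linear probing on a frozen encoder and Pretrain Gain as the difference against a randomly initialized encoder. The encoder, data, dimensions, and metric here are illustrative assumptions, not the paper's actual models or benchmarks.

```python
# Illustrative sketch only: the encoder, toy data, and ROC-AUC metric are
# assumptions for demonstration, not the paper's actual setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def extract_features(encoder_weights, x):
    """Stand-in for a frozen molecular encoder: fixed projection + ReLU."""
    return np.maximum(x @ encoder_weights, 0.0)

def linear_probe_auc(encoder_weights, x_train, y_train, x_test, y_test):
    """Train a linear classifier on frozen features; report test ROC-AUC."""
    z_train = extract_features(encoder_weights, x_train)
    z_test = extract_features(encoder_weights, x_test)
    probe = LogisticRegression(max_iter=1000).fit(z_train, y_train)
    return roc_auc_score(y_test, probe.predict_proba(z_test)[:, 1])

# Toy features standing in for molecular graph embeddings / fingerprints.
d_in, d_hid, n = 64, 32, 600
x = rng.normal(size=(n, d_in))
y = (x[:, 0] + 0.5 * x[:, 1] > 0).astype(int)
x_train, x_test, y_train, y_test = x[:400], x[400:], y[:400], y[400:]

# "Pretrained" vs. randomly initialized encoder weights (both synthetic here).
w_pretrained = rng.normal(size=(d_in, d_hid))
w_random = rng.normal(size=(d_in, d_hid))

auc_pre = linear_probe_auc(w_pretrained, x_train, y_train, x_test, y_test)
auc_rand = linear_probe_auc(w_random, x_train, y_train, x_test, y_test)

# Pretrain Gain: probe performance relative to random initialization.
# It can be low or negative, which is the paper's key observation.
pretrain_gain = auc_pre - auc_rand
print(f"linear-probe AUC (pretrained):  {auc_pre:.3f}")
print(f"linear-probe AUC (random init): {auc_rand:.3f}")
print(f"Pretrain Gain: {pretrain_gain:+.3f}")
```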
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 15763