Keywords: semi-supervised learning, self-supervised learning, ensemble, gnn, graph neural networks, ensemble distillation, geometric graph neural networks
Abstract: Machine learning is transforming molecular sciences by accelerating property prediction, simulation, and the discovery of new molecules and materials. Acquiring labeled data in these domains is often costly and time-consuming, whereas large collections of unlabeled molecular data are readily available. Standard semi-supervised learning methods often rely on label-preserving augmentations, which are challenging to design in the molecular domain, where minor structural changes can drastically alter properties. In this work, we show that semi-supervised methods built on an ensemble consensus can boost predictive accuracy across a diverse range of molecular datasets, task types, and graph neural network architectures. Notably, training with an ensemble consensus objective yields an effect similar to knowledge distillation: an individual member of an ensemble trained this way often outperforms a full ensemble trained in a traditional supervised fashion. In addition, this type of semi-supervised training reduces calibration error and remains robust across different datasets.
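As a minimal sketch of what an ensemble-consensus objective of this kind could look like for a regression task: each member fits the labeled data while also regressing toward the detached ensemble-mean prediction on unlabeled molecules. The function name `consensus_training_step`, the MSE losses, and the `consensus_weight` hyperparameter are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

def consensus_training_step(models, optimizer, labeled_batch, unlabeled_batch,
                            consensus_weight=1.0):
    """One step: supervised loss on labeled data plus a consensus loss
    pulling each ensemble member toward the ensemble mean on unlabeled data."""
    x_lab, y_lab = labeled_batch
    x_unlab = unlabeled_batch

    # Supervised loss: average MSE of each member on labeled molecules.
    sup_loss = torch.stack(
        [nn.functional.mse_loss(m(x_lab), y_lab) for m in models]
    ).mean()

    # Consensus target: ensemble mean prediction on unlabeled data,
    # detached so it acts as a fixed pseudo-label.
    with torch.no_grad():
        consensus = torch.stack([m(x_unlab) for m in models]).mean(dim=0)

    # Consensus loss: each member regresses toward the shared target.
    cons_loss = torch.stack(
        [nn.functional.mse_loss(m(x_unlab), consensus) for m in models]
    ).mean()

    loss = sup_loss + consensus_weight * cons_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, the distillation-like effect in the abstract would arise because every member is repeatedly trained against the (typically more accurate) ensemble consensus, so a single member absorbs much of the ensemble's predictive power.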
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 4088