Benchmarking Compositional Generalisation for Learning Inter-atomic Potentials

ICLR 2026 Conference Submission17168 Authors

19 Sept 2025 (modified: 08 Oct 2025) · License: CC BY 4.0
Keywords: neural networks, Graph Neural Networks, Transformers, compositional generalization, benchmark tasks
Abstract: Inter-atomic potentials play an important role in modelling molecular dynamics. Unfortunately, traditional methods for computing such potentials are computationally expensive. In recent years, the idea of using neural networks to approximate these computations has gained popularity, and a variety of Graph Neural Network and Transformer-based methods have been proposed for this purpose. Recent approaches provide highly accurate estimates, but they are typically trained and tested on the same molecules. It thus remains unclear whether these models mostly learn to interpolate the training labels, or whether their physically informed designs actually allow them to capture the underlying principles. To address this gap, we propose a benchmark consisting of four tasks that each require some form of compositional generalisation. Training and testing involve separate molecules, but the training data is chosen such that generalisation to the test examples should be feasible for models that learn the physical principles. Our empirical analysis shows that the considered tasks are highly challenging for state-of-the-art models, with errors for out-of-distribution examples often being orders of magnitude higher than for in-distribution examples.
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 17168