Less is More: Improving Molecular Force Fields with Minimal Temporal Information

ICLR 2026 Conference Submission25585 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Molecular prediction, AI for Science, graph neural networks, computational physics, Temporal information
TL;DR: We show that using an auxiliary loss on just two consecutive molecular dynamics frames is an optimal and counter-intuitive strategy for significantly improving the accuracy of neural network
Abstract: Accurate prediction of energies and forces for 3D molecular systems is one of the fundamental challenges at the core of AI for Science applications. Many powerful and data-efficient neural networks predict molecular energies and forces from single atomic configurations. However, one crucial aspect of the data generation process is rarely considered when training these models: the molecular dynamics (MD) simulation that produces the data. MD generates trajectories of atomic positions as a molecular system moves from higher-energy states toward lower-energy stable (equilibrium) states. This work explores a way to leverage MD data, when available, to improve the performance of such predictors. We introduce FRAMES, a novel auxiliary loss function that exploits the temporal relationships within MD trajectories. Counter-intuitively, we demonstrate that minimal temporal information, captured by pairs of just two consecutive frames, is optimal for this task, while longer trajectory sequences can introduce redundancy and degrade performance. The auxiliary loss operates on these pairs, encouraging the model to learn physically meaningful correspondences between configurations and forces. At test time, the model predicts energy and forces from the current configuration alone. On the widely used MD17 and ISO17 benchmarks, FRAMES significantly outperforms its Equiformer baseline, achieving highly competitive results in both energy and force accuracy. Our work not only presents a training strategy that improves model accuracy, but also provides evidence that, for distilling physical priors of atomic systems, more temporal data is not always better.
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 25585
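
The training setup described in the abstract (a primary energy/force loss on the current frame, plus an auxiliary term computed from a pair of consecutive MD frames and dropped at test time) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the FRAMES loss itself: the abstract does not give the form of the auxiliary term, so the sketch assumes a hypothetical auxiliary head that predicts the next frame's forces from the current frame's features. The names `Frame`, `paired_frame_loss`, `aux_head`, `aux_weight`, and the toy model are all made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Frame:
    """One MD snapshot: flattened atomic positions with reference labels."""
    positions: List[float]
    energy: float
    forces: List[float]


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def paired_frame_loss(
    model: Callable[[List[float]], Tuple[float, List[float], List[float]]],
    aux_head: Callable[[List[float]], List[float]],
    frame_t: Frame,
    frame_next: Frame,
    aux_weight: float = 0.1,  # hypothetical weighting, not from the source
) -> float:
    """Primary loss on frame t plus an auxiliary term using frame t+1.

    ASSUMPTION: the auxiliary target (next-frame forces) is illustrative;
    the source only states that the loss operates on consecutive frames.
    """
    energy, forces, features = model(frame_t.positions)
    primary = (energy - frame_t.energy) ** 2 + mse(forces, frame_t.forces)
    auxiliary = mse(aux_head(features), frame_next.forces)
    return primary + aux_weight * auxiliary


def toy_model(positions: List[float]) -> Tuple[float, List[float], List[float]]:
    """Toy harmonic potential E = 0.5 * sum(x^2), so F = -x."""
    energy = 0.5 * sum(x * x for x in positions)
    forces = [-x for x in positions]
    return energy, forces, list(positions)  # features are just the positions


def toy_aux_head(features: List[float]) -> List[float]:
    """Toy auxiliary head: guesses next-frame forces as -features."""
    return [-f for f in features]


frame_t = Frame(positions=[1.0, 2.0], energy=2.5, forces=[-1.0, -2.0])
frame_next = Frame(positions=[0.0, 0.0], energy=0.0, forces=[0.0, 0.0])

# Training: both frames of the pair contribute to the loss.
train_loss = paired_frame_loss(toy_model, toy_aux_head, frame_t, frame_next)

# Inference: only the current configuration and the main model are used.
pred_energy, pred_forces, _ = toy_model(frame_t.positions)
```

In this sketch the auxiliary head influences only the training objective. At inference the main model is called on a single configuration, which matches the abstract's statement that test-time prediction uses only the current configuration.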