Stability-Aware Training of Machine Learning Force Fields with Differentiable Boltzmann Estimators

Sanjeev Raja; Ishan Amin; Fabian Pedregosa; Aditi S. Krishnapriyan

Published: 25 Feb 2025, Last Modified: 25 Feb 2025

Accepted by TMLR

License: CC BY 4.0

Abstract: Machine learning force fields (MLFFs) are an attractive alternative to ab-initio methods for molecular dynamics (MD) simulations. However, they can produce unstable simulations, limiting their ability to model phenomena occurring over longer timescales and compromising the quality of estimated observables. To address these challenges, we present Stability-Aware Boltzmann Estimator (StABlE) Training, a multi-modal training procedure which leverages joint supervision from reference quantum-mechanical calculations and system observables. StABlE Training iteratively runs many MD simulations in parallel to seek out unstable regions, and corrects the instabilities via supervision with a reference observable. We achieve efficient end-to-end automatic differentiation through MD simulations using our Boltzmann Estimator, a generalization of implicit differentiation techniques to a broader class of stochastic algorithms. Unlike existing techniques based on active learning, our approach requires no additional ab-initio energy and forces calculations to correct instabilities. We demonstrate our methodology across organic molecules, tetrapeptides, and condensed phase systems, using three modern MLFF architectures. StABlE-trained models achieve significant improvements in simulation stability, data efficiency, and agreement with reference observables. Crucially, the stability improvements cannot be matched by simply reducing the simulation timestep, meaning that StABlE Training effectively allows for larger timesteps in MD simulations. By incorporating observables into the training process alongside first-principles calculations, StABlE Training can be viewed as a general semi-empirical framework applicable across MLFF architectures and systems. This makes it a powerful tool for training stable and accurate MLFFs, particularly in the absence of large reference datasets. Our code is publicly available at https://github.com/ASK-Berkeley/StABlE-Training.
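The abstract does not spell out the form of the Boltzmann Estimator. As a rough illustration of how an observable can be differentiated with respect to force-field parameters without backpropagating through a trajectory, the sketch below uses a textbook identity for canonical averages: for an observable O that does not itself depend on the parameter θ, d⟨O⟩/dθ = −β Cov(O, ∂U/∂θ). Whether this matches the paper's estimator exactly is an assumption. The function name `boltzmann_gradient_estimate`, the toy harmonic potential, and all numerical values are hypothetical and chosen only so that the result can be checked against a closed-form answer.

```python
import random


def boltzmann_gradient_estimate(samples, observable, dU_dtheta, beta):
    """Estimate d<O>/dtheta as -beta * Cov(O, dU/dtheta).

    Assumes `samples` are drawn from the Boltzmann distribution of U(x; theta)
    and that the observable has no explicit dependence on theta.
    """
    n = len(samples)
    o_vals = [observable(x) for x in samples]
    g_vals = [dU_dtheta(x) for x in samples]
    o_mean = sum(o_vals) / n
    g_mean = sum(g_vals) / n
    # Sample covariance between the observable and the parameter gradient of U.
    cov = sum((o - o_mean) * (g - g_mean) for o, g in zip(o_vals, g_vals)) / n
    return -beta * cov


# Toy system (hypothetical): 1D harmonic potential U(x; k) = 0.5 * k * x**2.
# Its Boltzmann distribution is Gaussian with variance 1 / (beta * k),
# so exact samples can be drawn directly instead of running MD.
beta, k = 1.0, 2.0
rng = random.Random(0)
sigma = (1.0 / (beta * k)) ** 0.5
samples = [rng.gauss(0.0, sigma) for _ in range(200_000)]

# Observable O(x) = x**2, for which <O> = 1 / (beta * k).
estimate = boltzmann_gradient_estimate(
    samples,
    observable=lambda x: x * x,
    dU_dtheta=lambda x: 0.5 * x * x,  # dU/dk for the harmonic potential
    beta=beta,
)

# Closed-form reference: d<O>/dk = -1 / (beta * k**2).
exact = -1.0 / (beta * k ** 2)
```

In this toy case `estimate` should agree with `exact` up to sampling noise. In a real setting the samples would come from MD simulations run with the MLFF, and ∂U/∂θ would be obtained by automatic differentiation of the model energy.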

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: Deanonymized, camera-ready version with link to code (added sentence to abstract)

Code: https://github.com/ASK-Berkeley/StABlE-Training

Supplementary Material: pdf

Assigned Action Editor: ~Efstratios_Gavves1

Submission Number: 3460
