From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide MLIP Architectures

20 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: Machine Learning Interatomic Potential, Physical Soundness, Potential Energy Surface, Smoothness, Scaling, Differentiable k-Nearest-Neighbor
TL;DR: The Bond Smoothness Characterization Test (BSCT) is an explicit and efficient test of potential energy surface (PES) smoothness, both near and far from equilibrium, that enables the design of models that are physically reliable and state-of-the-art in accuracy.
Abstract: The reliability of machine learning interatomic potentials (MLIPs) in downstream physics tasks depends not only on reproducing reference energies and forces, but also on the smoothness of the underlying potential energy surface (PES). Prior work has evaluated smoothness indirectly, most commonly by running microcanonical molecular dynamics (MD) simulations or calculating phonon modes, but such tests capture only near-equilibrium smoothness and are computationally expensive. We introduce the Bond Smoothness Characterization Test (BSCT), a simple and inexpensive benchmark that directly quantifies PES smoothness both near and far from equilibrium by probing controlled bond deformations. Because BSCT measures the PES itself, it can detect a wide range of instabilities, such as discontinuities, artificial minima, and spuriously large forces. To investigate how BSCT can guide the design of scalable, physically reliable MLIPs, we start from an unconstrained Swin-Transformer-inspired backbone and conduct a controlled study on the SPICE (molecules) and MPTrj (materials) datasets. To this baseline we add targeted design changes: differentiable k-nearest-neighbor graphs, temperature-controlled attention, and broadened radial smearing widths. At each step, we measure energy and force accuracy, energy conservation in microcanonical simulations, and the BSCT metric. Our results show that BSCT improvements consistently predict reductions in MD instabilities and enable early-stage filtering of problematic models. The final BSCT-guided models achieve state-of-the-art accuracy on SPICE and MPTrj while maintaining excellent smoothness, demonstrating that optimizing for physical soundness via BSCT naturally yields high performance. Our results position BSCT as a practical, general-purpose metric for guiding the design of reliable MLIPs.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 23221
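Illustration: the abstract describes probing controlled bond deformations to expose discontinuities, artificial minima, and spuriously large forces. The sketch below shows what such a one-dimensional bond scan could look like. It is not the paper's BSCT definition, which this page does not give. Everything in it is an assumption made for illustration: the Morse potential standing in for a smooth PES, the artificial energy jump at `r_cut` standing in for a non-smooth model, the scan range, and the two indicators `n_minima` and `max_force_jump`. Units are taken to be eV and Å.

```python
import numpy as np


def morse_energy(r, depth=4.5, width=1.9, r_eq=1.1):
    """Hypothetical smooth reference PES for one bond (Morse potential)."""
    return depth * (1.0 - np.exp(-width * (r - r_eq))) ** 2


def jumpy_energy(r, r_cut=1.805, jump=0.3):
    """Hypothetical non-smooth 'model': Morse plus an energy jump past r_cut."""
    return morse_energy(r) + np.where(r > r_cut, jump, 0.0)


def scan_bond(energy_fn, r_min=0.8, r_max=3.0, n_points=221):
    """Evaluate the energy on a uniform grid of bond lengths."""
    r = np.linspace(r_min, r_max, n_points)
    return r, energy_fn(r)


def roughness_report(r, energy):
    """Simple smoothness indicators for a 1D bond scan (illustrative only)."""
    step = r[1] - r[0]
    # Finite-difference force at the midpoints between grid points.
    force = -np.diff(energy) / step
    # Interior grid points that lie strictly below both neighbours.
    inner = energy[1:-1]
    n_minima = int(np.sum((inner < energy[:-2]) & (inner < energy[2:])))
    return {
        "n_minima": n_minima,  # more than one suggests an artificial minimum
        "max_abs_force": float(np.max(np.abs(force))),
        # Largest change in force between adjacent midpoints; a jump in the
        # energy shows up here as a spike.
        "max_force_jump": float(np.max(np.abs(np.diff(force)))),
    }


smooth = roughness_report(*scan_bond(morse_energy))
rough = roughness_report(*scan_bond(jumpy_energy))
print("smooth:", smooth)
print("rough: ", rough)
```

In this toy setting the energy jump leaves the number of minima unchanged but produces a much larger `max_force_jump` than the smooth curve, which is the kind of signal a direct PES scan can surface without running a full MD simulation.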