A recipe for scalable attention-based ML Potentials: unlocking long-range accuracy with all-to-all node attention
Keywords: machine learning potentials; AI for Science
Abstract: Machine-learning interatomic potentials (MLIPs) have advanced rapidly, with many top models relying on strong physics-based inductive biases, including rotational equivariance, high-order directional features, and energy conservation. However, as these MLIPs are trained and evaluated on ever larger systems, such as biomolecules and electrolytes, it is increasingly clear that scalable and accurate treatments of long-range (LR) interactions are needed. The most common approaches in the literature address LR interactions by adding explicit physics-based inductive biases to the model. In this work, we propose a conceptually straightforward, data-driven, attention-based, and energy-conserving MLIP, AllScAIP, that addresses LR interactions and scales to O(100 million) training set sizes: a stack of local neighborhood self-attention layers followed by all-to-all node attention for global interactions across an entire atomistic system. Extensive ablations across model and dataset scales reveal a consistent picture: in low-data/small-model regimes, inductive biases provide some gains in sample efficiency, and the all-to-all node attention increases LR accuracy. As data and parameters scale, the marginal benefit of these inductive biases diminishes (and can even reverse), while the all-to-all node attention remains the most durable ingredient for learning LR interactions. Our model achieves state-of-the-art results on both energy/force accuracy and relevant physics-based evaluations on a representative molecular dataset (OMol25), while remaining competitive on materials (OMat24) and catalyst (OC20) datasets.
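To make the two-stage attention idea in the abstract concrete, below is a minimal PyTorch sketch, not the authors' AllScAIP implementation: the class names (LocalSelfAttention, AllToAllAttention), the cutoff value, and all hyperparameters are illustrative assumptions. Per-atom features are first updated with self-attention masked to each atom's local neighborhood, then with unmasked all-to-all node attention over every atom in the system.

```python
# Minimal sketch of the local-then-global attention idea described above.
# NOT the authors' AllScAIP code; names and hyperparameters are assumptions.
import torch
import torch.nn as nn


class LocalSelfAttention(nn.Module):
    """Self-attention over atoms, masked to each atom's local neighborhood."""

    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, x: torch.Tensor, outside_cutoff: torch.Tensor) -> torch.Tensor:
        # x: (1, num_atoms, hidden_dim) atom features for a single system.
        # outside_cutoff: (num_atoms, num_atoms) bool; True blocks attention,
        # so each atom only attends to neighbors within its distance cutoff.
        out, _ = self.attn(x, x, x, attn_mask=outside_cutoff)
        return self.norm(x + out)


class AllToAllAttention(nn.Module):
    """Unmasked node attention across the entire atomistic system."""

    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Every atom attends to every other atom: the global, data-driven
        # route to long-range interactions.
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)


# Example: a stack of local blocks followed by one global block.
num_atoms, hidden_dim, num_heads = 64, 128, 8
x = torch.randn(1, num_atoms, hidden_dim)
positions = torch.randn(num_atoms, 3)
outside_cutoff = torch.cdist(positions, positions) > 5.0  # assumed 5 Angstrom cutoff

local_layers = nn.ModuleList(LocalSelfAttention(hidden_dim, num_heads) for _ in range(4))
global_layer = AllToAllAttention(hidden_dim, num_heads)
for layer in local_layers:
    x = layer(x, outside_cutoff)
x = global_layer(x)
```

In energy-conserving MLIPs of this kind, a readout head typically maps the final atom features to per-atom energies, and forces are obtained by differentiating the summed energy with respect to atomic positions.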
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 14611