EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations

Vaibhav Bihani; Utkarsh Pratiush; Sajid Mannan; Tao Du; Zhimin Chen; Santiago Miret; Matthieu Micoulaut; Morten M Smedskjaer; Sayan Ranu; N M Anoop Krishnan

Published: 27 Oct 2023, Last Modified: 11 Dec 2023, AI4Mat-2023 Spotlight

Submission Track: Papers

Submission Category: AI-Guided Design

Keywords: Graph neural network, equivariant neural network, atomistic simulations, molecular dynamics

TL;DR: We benchmark six equivariant graph neural network force fields on existing and new datasets and tasks for modeling atomic systems.

Abstract: Equivariant graph neural network force fields (EGraFFs) have shown great promise in modelling complex interactions in atomic systems by exploiting the graphs’ inherent symmetries. Recent works have led to a surge in the development of novel architectures that incorporate equivariance-based inductive biases alongside architectural innovations like graph transformers and message passing to model atomic interactions. However, a thorough evaluation of these EGraFFs when deployed for the downstream task of real-world atomistic simulations is lacking. To this end, we perform a systematic benchmarking of six EGraFF algorithms (NequIP, Allegro, BOTNet, MACE, Equiformer, TorchMDNet), with the aim of understanding their capabilities and limitations for realistic atomistic simulations. In addition to our thorough evaluation and analysis on eight existing datasets based on the benchmarking literature, we release two new benchmark datasets and propose four new metrics and three challenging tasks. The new datasets and tasks evaluate the performance of EGraFFs on out-of-distribution data, in terms of different crystal structures, temperatures, and new molecules. Interestingly, evaluation of the EGraFF models based on dynamic simulations reveals that having a lower error on energy or force does not guarantee stable or reliable simulation or faithful replication of the atomic structures. Moreover, we find that no model clearly outperforms the other models on all datasets and tasks. Importantly, we show that the performance of all the models on out-of-distribution datasets is unreliable, pointing to the need for the development of a foundation model for force fields that can be used in real-world simulations. In summary, this work establishes a rigorous framework for evaluating machine learning force fields in the context of atomic simulations and points to open research challenges within this domain.

Digital Discovery Special Issue: Yes

Submission Number: 53
