EquiformerV3: Scaling Efficient, Expressive and General SE(3)-Equivariant Graph Attention Transformers

18 Jan 2026 (modified: 24 Jun 2026) | Submitted to ICML 2026 | CC BY 4.0
TL;DR: We improve the efficiency, expressivity and generality of SE(3)-equivariant graph Transformers and achieve state-of-the-art results on OC20, OMat24 and Matbench Discovery under the settings of direct forces and energy-conserving simulations.
Abstract: As SE(3)-equivariant graph neural networks mature as a core tool for 3D atomistic modeling, improving their efficiency, expressivity, and physical consistency has become a central challenge for large-scale applications. In this work, we introduce EquiformerV3, the third generation of the SE(3)-equivariant graph attention Transformer, designed to advance all three dimensions: efficiency, expressivity, and generality. Building on EquiformerV2, we make three key advances. First, we optimize the software implementation, achieving a 1.75× speedup. Second, we introduce simple and effective modifications to EquiformerV2, including equivariant merged layer normalization, improved feedforward network hyper-parameters, and attention with a smooth radius cutoff. Third, we propose SwiGLU-$S^2$ activations, which incorporate many-body interactions for better theoretical expressivity and preserve strict equivariance while reducing the complexity of sampling $S^2$ grids. Together, SwiGLU-$S^2$ activations and smooth-cutoff attention enable accurate modeling of smoothly varying potential energy surfaces (PES), generalizing EquiformerV3 to tasks requiring energy-conserving simulations and higher-order derivatives of the PES. With these improvements, EquiformerV3 achieves state-of-the-art results on OC20, OMat24, and Matbench Discovery.
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: equivariant neural networks, graph neural networks, computational physics, transformer networks
Submission Number: 7295
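
Illustration (not from the paper): the abstract does not give the form of the smooth radius cutoff, so the sketch below assumes a common choice, a cosine envelope that equals 1 at zero distance and decays to 0 with zero slope at the cutoff. The names `cosine_cutoff` and `smooth_attention` are hypothetical, and placing the envelope inside the softmax sum is one possible design, not necessarily the one used in EquiformerV3. The point is only to show why such an envelope helps: a neighbor's attention weight shrinks continuously to zero as it approaches the cutoff, instead of dropping abruptly when it leaves the neighbor list.

```python
import math


def cosine_cutoff(r: float, r_cut: float) -> float:
    """Assumed envelope: 1 at r = 0, 0 for r >= r_cut, zero slope at r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * r / r_cut))


def smooth_attention(logits, distances, r_cut):
    """Softmax-style weights where each neighbor's term is scaled by the envelope.

    logits    : unnormalized attention scores, one per neighbor
    distances : interatomic distances, one per neighbor
    r_cut     : cutoff radius (same units as distances)
    """
    m = max(logits)  # subtract the max for numerical stability
    terms = [
        cosine_cutoff(d, r_cut) * math.exp(l - m)
        for l, d in zip(logits, distances)
    ]
    total = sum(terms)
    if total == 0.0:
        # Every neighbor lies at or beyond the cutoff.
        return [0.0 for _ in terms]
    return [t / total for t in terms]


# Two neighbors with equal scores; the second sits just inside the cutoff,
# so it receives an almost-zero weight rather than half of the attention.
weights = smooth_attention([0.0, 0.0], [1.0, 6.0 - 1e-6], r_cut=6.0)
```

With a hard cutoff, the second neighbor above would receive weight 0.5 until it crossed the radius and then vanish, producing a jump in the predicted energy; with the envelope, its contribution is already negligible at the boundary.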