Fast and Global Equivariant Attention

ICLR 2026 Conference Submission 15391 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: equivariant machine learning, point cloud transformers
TL;DR: This work proposes a global equivariant attention mechanism with asymptotically optimal O(N log N) token complexity.
Abstract: The global attention mechanism is one of the keys to the success of the transformer architecture, but it incurs computational costs that are quadratic in the number of tokens. Equivariant models, which leverage the underlying geometric structure of problem instances, often achieve superior accuracy on physical, biochemical, computer vision, and robotic tasks, but at the cost of additional compute and memory requirements. As a result, existing equivariant transformers only support low-order equivariant features and local context windows, limiting their expressiveness and performance. This work proposes a global equivariant attention mechanism that achieves efficient global attention through a novel Clebsch-Gordan convolution on $\mathrm{SO}(3)$ irreducible representations, combined with a local dense attention. Our method enables equivariant modeling of features at all orders while achieving $O(N \log N)$ token complexity. Additionally, the proposed method scales well to high-order irreducible features by exploiting the sparsity of the Clebsch-Gordan matrix. Lastly, we also incorporate optional token permutation equivariance through either weight sharing or data augmentation. We evaluate our method on a diverse set of benchmarks, including QM9, n-body simulation, ModelNet point cloud classification, a geometric recall dataset, and a robotic grasping dataset, showing clear gains over existing equivariant transformers in GPU memory usage, speed, and accuracy.
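To make the sparsity claim concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of coupling two $\mathrm{SO}(3)$ irreducible features through a Clebsch-Gordan-style tensor. It uses e3nn's precomputed Wigner 3j tensors, which are proportional to the Clebsch-Gordan coefficients, and contracts only over the nonzero entries rather than the full dense $(2\ell_1+1)(2\ell_2+1)(2\ell_3+1)$ tensor; the variable names and the choice of orders are illustrative assumptions.

```python
# Minimal sketch: sparse Clebsch-Gordan-style coupling of two SO(3) irrep features.
# Assumes torch and e3nn are installed; not the paper's code.
import torch
from e3nn import o3  # provides Wigner 3j tensors, proportional to Clebsch-Gordan coefficients

l1, l2, l3 = 2, 3, 4            # input orders and one allowed output order (|l1 - l2| <= l3 <= l1 + l2)
x = torch.randn(2 * l1 + 1)     # irrep feature of order l1 (dimension 2*l1 + 1)
y = torch.randn(2 * l2 + 1)     # irrep feature of order l2

cg = o3.wigner_3j(l1, l2, l3)            # coupling tensor of shape (2l1+1, 2l2+1, 2l3+1)
nz = cg.nonzero(as_tuple=False)          # indices of its nonzero entries (the sparse structure)
print(f"nonzero couplings: {len(nz)} of {cg.numel()} entries")

# Sparse contraction: only iterate over nonzero (m1, m2, m3) triples.
z = torch.zeros(2 * l3 + 1)
for m1, m2, m3 in nz.tolist():
    z[m3] += cg[m1, m2, m3] * x[m1] * y[m2]

# Dense reference contraction for comparison.
z_dense = torch.einsum("i,j,ijk->k", x, y, cg)
assert torch.allclose(z, z_dense, atol=1e-6)
```

Because most entries of the coupling tensor vanish, restricting the contraction to the nonzero index set is what keeps the per-pair cost manageable as the irrep order grows; how the paper batches and fuses these contractions across tokens is described in the full text, not in this sketch.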
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 15391