TL;DR: Geometric Hyena Networks is the first equivariant long-convolutional model that efficiently captures global geometric context at sub-quadratic complexity.
Abstract: Processing global geometric context while preserving equivariance is crucial when modeling biological, chemical, and physical systems. Yet, this is challenging due to the computational demands of equivariance and global context at scale. Standard methods such as equivariant self-attention suffer from quadratic complexity, while local methods such as distance-based message passing sacrifice global information. Inspired by the recent success of state-space and long-convolutional models, we introduce Geometric Hyena, the first equivariant long-convolutional model for geometric systems. Geometric Hyena captures global geometric context at sub-quadratic complexity while maintaining equivariance to rotations and translations. Evaluated on all-atom property prediction of large RNA molecules and full protein molecular dynamics, Geometric Hyena outperforms existing equivariant models while requiring significantly less memory and compute than equivariant self-attention. Notably, our model processes the geometric context of $30k$ tokens $20 \times$ faster than the equivariant transformer and allows $72 \times$ longer context within the same budget.
Lay Summary: Many scientific problems such as predicting the properties or structures of proteins and RNAs require understanding the organization of geometric systems. To be robust, models must respect physical symmetries like rotation and translation. However, standard models that support processing such symmetries either consume too much memory or struggle to capture long-range relationships.
To address this, we developed Geometric Hyena, a new model designed to efficiently handle large and complex geometric data. It uses a method called long convolution to capture how different parts of a molecule relate to each other, even over great distances, while keeping computation fast and memory use low.
Geometric Hyena outperforms existing methods in predicting how RNA degrades and how proteins move, tasks with real-world impact in drug discovery and biological research. Notably, our model can process much longer geometric systems—millions of input tokens—on a single standard GPU. This makes it a powerful and accessible tool for researchers studying large geometric structures such as biomolecules.
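To illustrate why long convolution can be sub-quadratic, here is a minimal sketch of a generic FFT-based causal convolution over a sequence of N tokens. It is not the Geometric Hyena layer itself: the function names `fft_long_conv` and `direct_conv`, the array shapes, and the use of NumPy are assumptions made for illustration, and the example ignores equivariance entirely. It only shows that a kernel as long as the input can be applied in O(N log N) time rather than O(N^2).

```python
import numpy as np

def fft_long_conv(u, k):
    """Causal convolution of u (N, D) with a length-N kernel k (N, D).

    Runs in O(N log N) per channel via the FFT (illustrative helper,
    not taken from the paper).
    """
    n = u.shape[0]
    padded = 2 * n  # zero-pad so the circular FFT product equals a linear convolution
    u_f = np.fft.rfft(u, n=padded, axis=0)
    k_f = np.fft.rfft(k, n=padded, axis=0)
    y = np.fft.irfft(u_f * k_f, n=padded, axis=0)
    return y[:n]  # keep the first N outputs (the causal part)

def direct_conv(u, k):
    """Reference O(N^2) causal convolution, used only to check the FFT version."""
    n = u.shape[0]
    y = np.zeros_like(u)
    for t in range(n):
        for s in range(t + 1):
            y[t] += k[t - s] * u[s]
    return y

rng = np.random.default_rng(0)
u = rng.standard_normal((64, 3))  # 64 tokens, 3 channels (hypothetical sizes)
k = rng.standard_normal((64, 3))  # kernel spans the whole sequence
print(np.allclose(fft_long_conv(u, k), direct_conv(u, k)))
```

Because the kernel is as long as the sequence, every output position can depend on every earlier token, which is the sense in which such a layer captures global context without forming an N by N interaction matrix.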
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Deep Learning->Graph Neural Networks
Keywords: equivariance, global context, long convolution, scalability, mechanistic interpretability, architecture
Submission Number: 339