TL;DR: We present a hierarchical transformer that uses ball tree partitioning to process physical data on irregular grids in linear time while capturing both local and global interactions.
Abstract: Large-scale physical systems defined on irregular grids pose significant scalability challenges for deep learning methods, especially in the presence of long-range interactions and multi-scale coupling. Traditional approaches that compute all pairwise interactions, such as attention, become computationally prohibitive as they scale quadratically with the number of nodes. We present Erwin, a hierarchical transformer inspired by methods from computational many-body physics, which combines the efficiency of tree-based algorithms with the expressivity of attention mechanisms. Erwin employs ball tree partitioning to organize computation, which enables linear-time attention by processing nodes in parallel within local neighborhoods of fixed size. Through progressive coarsening and refinement of the ball tree structure, complemented by a novel cross-ball interaction mechanism, it captures both fine-grained local details and global features. We demonstrate Erwin's effectiveness across multiple domains, including cosmology, molecular dynamics, and particle fluid dynamics, where it consistently outperforms baseline methods both in accuracy and computational efficiency.
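A minimal sketch of why attention restricted to fixed-size balls costs linear time, written with NumPy. It is an illustration under stated assumptions, not the Erwin implementation. The function names `ball_tree_order` and `ball_attention` are hypothetical, the tree is built by median splits along the widest axis, the number of points is assumed to be `ball_size` times a power of two, and the attention omits learned projections, multiple heads, coarsening, and the cross-ball interaction described above.

```python
import numpy as np

def ball_tree_order(points, ball_size):
    """Return a permutation of point indices in which every consecutive
    chunk of `ball_size` indices forms one leaf ball.
    Assumption: len(points) == ball_size * 2**k, so all leaves are full."""
    n = len(points)
    n_balls = n // ball_size
    assert n % ball_size == 0 and n_balls & (n_balls - 1) == 0

    def split(idx):
        if len(idx) <= ball_size:
            return [idx]
        coords = points[idx]
        # Split at the median along the axis with the largest spread.
        axis = np.argmax(coords.max(axis=0) - coords.min(axis=0))
        order = idx[np.argsort(coords[:, axis], kind="stable")]
        half = len(order) // 2
        return split(order[:half]) + split(order[half:])

    return np.concatenate(split(np.arange(n)))

def ball_attention(features, order, ball_size):
    """Unparameterized self-attention computed independently inside each ball.
    Cost is (n / ball_size) * ball_size**2 * d, i.e. linear in n for fixed ball_size."""
    n, d = features.shape
    x = features[order].reshape(n // ball_size, ball_size, d)
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(d)      # (balls, m, m)
    scores -= scores.max(axis=-1, keepdims=True)        # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    out = (weights @ x).reshape(n, d)
    result = np.empty_like(out)
    result[order] = out                                 # undo the permutation
    return result

rng = np.random.default_rng(0)
points = rng.normal(size=(64, 3))      # irregular 3D positions
features = rng.normal(size=(64, 4))    # per-point features
order = ball_tree_order(points, ball_size=8)
updated = ball_attention(features, order, ball_size=8)
```

Because every point attends only to the `ball_size` points in its own leaf, doubling the number of points doubles the number of balls and therefore the work, whereas full attention would quadruple it.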
Lay Summary: **Problem**: Scientists use computer simulations to study complex physical systems like weather patterns or molecular interactions. These simulations often involve thousands or millions of data points scattered irregularly in space, making them extremely slow to compute. Current AI methods struggle because they try to analyze every possible connection between data points, which becomes impossibly slow as systems grow larger.
**Solution**: We developed Erwin, a new AI system inspired by techniques from physics that efficiently handle large-scale calculations. Instead of analyzing all connections at once, Erwin organizes data points into a hierarchy of "neighborhoods." It focuses on nearby interactions within each neighborhood while capturing broader patterns through this hierarchical structure. As a result, computation time grows in proportion to the number of data points rather than with its square, which makes the approach practical for real-world applications.
**Impact**: Erwin handles large-scale simulations in cosmology, molecular dynamics, and fluid mechanics, achieving better accuracy and faster runtimes than the baseline methods it was compared against. This enables scientists to simulate larger, more realistic systems, potentially accelerating discoveries in climate science, drug development, and space research by making previously infeasible calculations practical.
Link To Code: https://github.com/maxxxzdn/erwin
Primary Area: Deep Learning
Keywords: hierarchical transformer, ball tree, physical simulations, tree-based methods, window attention, AI4Science
Submission Number: 3105