TL;DR: FlashTP accelerates equivariant MLIPs by optimizing Tensor-Product operations, achieving kernel speedups of up to 41.6$\times$ and significantly reducing memory footprint.
Abstract: Machine Learning Interatomic Potentials (MLIPs) enable efficient molecular dynamics (MD) simulations with high accuracy. While equivariant MLIPs achieve state-of-the-art accuracy, they face significant computational bottlenecks centered around their Tensor-Product layer, which accounts for up to 75\% of training time and causes substantial memory overhead. We present FlashTP, a highly optimized tensor-product library that addresses these inefficiencies through kernel fusion, sparse computation, and path-aggregated execution. FlashTP achieves up to 41.6$\times$ and 60.8$\times$ kernel speedups over _e3nn_ and NVIDIA cuEquivariance, respectively. For SevenNet-l3i5, it delivers 4.2$\times$ and 3.5$\times$ speedups while reducing peak memory usage by 6.3$\times$ and 6.2$\times$ for inference and training, respectively. The code is available at https://github.com/SNU-ARC/flashTP.
Lay Summary: Imagine watching a slow-motion movie of atoms as they jiggle, bump into each other, and form new structures. That’s what molecular dynamics (MD) simulations do on a computer—letting scientists see how materials behave or how proteins fold, without costly lab experiments.
Recently, researchers have started using machine-learning interatomic potentials (MLIPs)—deep neural networks trained on high-precision quantum data—to make these simulations both faster and more accurate. However, MLIP-driven simulations are bottlenecked by a mathematical operation called the tensor product, which consumes approximately 75–90% of both computation time and memory.
We built FlashTP, an optimized GPU library that fuses those slow steps into a single kernel, removing redundant data movement and skipping work that isn’t needed. On modern hardware, FlashTP lets scientists train their models more than 3.5× faster, run simulations 4.2× faster, and use over 6× less memory compared to the popular equivariant-network library _e3nn_. Best of all, it plugs right into the _e3nn_ framework, so you can switch on FlashTP with almost zero code changes and start seeing the speed boost immediately.
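To make the "almost zero code changes" claim concrete, here is a minimal sketch of the tensor-product operation that FlashTP targets, written with e3nn's public API (the `o3.Irreps` and `o3.FullyConnectedTensorProduct` calls are real e3nn interfaces). The commented-out `flashtp` import and `FusedTensorProduct` drop-in are assumptions about FlashTP's interface for illustration only; see the repository linked below for the actual API.

```python
# Sketch of the e3nn tensor product that dominates equivariant-MLIP run time,
# plus a hypothetical FlashTP drop-in replacement (names assumed).
import torch
from e3nn import o3

irreps_in1 = o3.Irreps("16x0e + 16x1o")  # per-edge node features
irreps_in2 = o3.Irreps("1x0e + 1x1o")    # spherical harmonics of edge vectors
irreps_out = o3.Irreps("16x0e + 16x1o")  # message features

# Baseline: the e3nn tensor-product layer that FlashTP accelerates.
tp = o3.FullyConnectedTensorProduct(irreps_in1, irreps_in2, irreps_out)

x = irreps_in1.randn(1024, -1)  # 1024 edges worth of features
y = irreps_in2.randn(1024, -1)
out = tp(x, y)

# Hypothetical drop-in (assumed names, not the documented FlashTP API):
# from flashtp import FusedTensorProduct
# tp = FusedTensorProduct(irreps_in1, irreps_in2, irreps_out)
# out = tp(x, y)  # same call signature, fused kernel underneath
```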
Link To Code: https://github.com/SNU-ARC/flashTP
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: Equivariant neural networks, Tensor Product, Software libraries, Efficiency, Machine-learned interatomic potential (MLIP), Machine Learning Force Fields (MLFF)
Submission Number: 450