Scalable Hierarchical Self-Attention with Learnable Hierarchy for Long-Range Interactions

Published: 11 Apr 2024, Last Modified: 11 Apr 2024
Accepted by TMLR
Abstract: Self-attention models have made great strides toward accurately modeling a wide array of data modalities, including, more recently, graph-structured data. This paper demonstrates that adaptive hierarchical attention can go a long way toward successfully applying transformers to graphs. Our proposed model, Sequoia, provides a powerful inductive bias toward long-range interaction modeling, leading to better generalization. We propose an end-to-end mechanism for the data-dependent construction of a hierarchy, which in turn guides the self-attention mechanism. Using an adaptive hierarchy provides a natural pathway toward sparse attention by constraining node-to-node interactions to the immediate family of each node in the hierarchy (e.g., parent, children, and siblings). This dramatically reduces the computational complexity of a self-attention layer from quadratic to log-linear in the input size while maintaining, and sometimes even surpassing, the standard transformer's ability to model long-range dependencies across the entire input. Experimentally, we report state-of-the-art performance on long-range graph benchmarks while remaining computationally efficient. Moving beyond graphs, we also demonstrate competitive performance on long-range sequence modeling and on point-cloud classification and segmentation when using a fixed hierarchy. Our source code is publicly available at https://github.com/HySonLab/HierAttention.
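To make the sparsity pattern concrete, below is a minimal sketch (not the authors' implementation) of the "immediate family" attention mask described in the abstract: each node may attend only to itself, its parent, its children, and its siblings in the hierarchy. The parent pointers here define a hypothetical fixed tree purely for illustration; in Sequoia the hierarchy itself is constructed end-to-end from the data.

```python
# Sketch of a hierarchical sparse attention mask, assuming the hierarchy is
# given as a list of parent pointers (None marks the root). This only builds
# the boolean mask; it does not reproduce Sequoia's learned hierarchy.
import numpy as np

def family_attention_mask(parent):
    """Return an N x N boolean mask; mask[i, j] is True if node i may attend to node j."""
    n = len(parent)
    mask = np.eye(n, dtype=bool)                  # every node attends to itself
    for i, p in enumerate(parent):
        if p is None:                             # root: no parent, no siblings
            continue
        mask[i, p] = mask[p, i] = True            # parent <-> child
        for j, q in enumerate(parent):
            if j != i and q == p:                 # same parent => siblings
                mask[i, j] = True
    return mask

# Example: a balanced binary hierarchy over 7 nodes (node 0 is the root).
parent = [None, 0, 0, 1, 1, 2, 2]
mask = family_attention_mask(parent)
print(mask.sum(), "allowed pairs out of", mask.size)  # far fewer than N^2 as N grows
```

With a balanced hierarchy each node has O(1) family members per level and the tree has O(log N) levels, which is the intuition behind the log-linear cost quoted in the abstract.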
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: N/A
Code: https://github.com/HySonLab/HierAttention
Assigned Action Editor: ~Elliot_Meyerson1
Submission Number: 1976