Learning Visual Structures with Adaptive Hyperedge Propagation

Luyao Tang, Zheyuan Cai

Published: 29 Nov 2022, Last Modified: 05 May 2026, OpenReview Archive Direct Upload, Everyone, CC BY 4.0

Abstract: Pairwise token affinity has become a common primitive in modern visual recognition models, from self-attention layers to graph message passing networks. However, many visual concepts are better described as correlations among sets of regions, such as object parts, repeated textures, or spatially distributed contextual cues. We propose an adaptive hyperedge propagation framework for general-purpose visual understanding. The framework constructs a multi-order hypergraph over image tokens by assigning patches to a small set of learned semantic anchors, producing scalable hyperedges that encode both appearance similarity and structural layout. To update features on this induced topology, we introduce a two-way propagation module that aggregates vertex features into hyperedge states and redistributes high-order contextual information back to vertices. This design avoids dense all-pairs computation while preserving correlations that cannot be faithfully captured by standard graph layers. We instantiate the method in both isotropic and hierarchical backbones and evaluate it on large-scale image classification. Results demonstrate that explicit high-order correlation modeling yields competitive or superior accuracy compared with Transformer and graph-based baselines, while substantially reducing model size and computational cost.
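
As a rough illustration of the anchor-based hyperedge construction and two-way propagation described in the abstract, the following NumPy sketch may help. It is not the authors' implementation. The softmax assignment, the mean aggregation, the residual update, and the names `soft_assign`, `propagate`, `temperature`, and `eps` are all assumptions made for illustration. Anchors are random here rather than learned, and the multi-order and structural-layout components are omitted.

```python
import numpy as np


def soft_assign(tokens, anchors, temperature=1.0):
    """Return a soft incidence matrix H of shape (N, K).

    Row i holds the membership weights of token i over the K anchors,
    one hyperedge per anchor. Softmax over dot-product similarity is an
    assumed choice, not taken from the paper.
    """
    logits = tokens @ anchors.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def propagate(tokens, incidence, eps=1e-8):
    """One round of vertex -> hyperedge -> vertex propagation (assumed form)."""
    # Vertex -> hyperedge: each hyperedge state is the membership-weighted
    # mean of the tokens assigned to it.
    edge_mass = incidence.sum(axis=0)[:, None]               # (K, 1)
    edge_states = (incidence.T @ tokens) / (edge_mass + eps)  # (K, D)
    # Hyperedge -> vertex: each token receives a mixture of the hyperedge
    # states it belongs to, added as a residual.
    context = incidence @ edge_states                         # (N, D)
    return tokens + context, edge_states


rng = np.random.default_rng(0)
tokens = rng.standard_normal((196, 64))   # e.g. a 14 x 14 patch grid, D = 64
anchors = rng.standard_normal((8, 64))    # K = 8 stand-ins for learned anchors

H = soft_assign(tokens, anchors)
updated, edge_states = propagate(tokens, H)
print(H.shape, edge_states.shape, updated.shape)
```

Under these assumptions, each propagation round costs on the order of N·K·D operations for N tokens, K anchors, and feature width D, whereas dense all-pairs affinity costs on the order of N²·D. With K much smaller than N, this is one way the avoidance of all-pairs computation could be realized.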