Keywords: GNNs, Oversquashing, Vanishing gradients, Spectral graph, Equivariance, Invariance, Transformer, Attention, Oversmoothing, MPNN
TL;DR: We show that oversquashing occurs even in short-range tasks via a two-radius problem, revealing that bottleneck effects are distinct from vanishing gradients and that virtual nodes are insufficient to address them.
Abstract: Message Passing Neural Networks (MPNNs) are widely used for learning on graphs, but their ability to process long-range information is limited by the phenomenon of _oversquashing_. This limitation has led some researchers to advocate graph transformers as a better alternative, whereas others suggest that it can be mitigated within the MPNN framework, using virtual nodes or other rewiring techniques.
In this work, we demonstrate that oversquashing is not limited to long-range tasks, but can also arise in short-range problems. This observation allows us to disentangle two distinct mechanisms underlying oversquashing: (1) the _bottleneck phenomenon_, which can arise even in short-range settings, and (2) the _vanishing gradient phenomenon_, which is closely associated with long-range tasks.
We further show that the short-range bottleneck effect is not captured by existing explanations for oversquashing, and that adding virtual nodes does not resolve it. In contrast, transformers do succeed in such tasks, positioning them as a more compelling solution to oversquashing than specialized MPNNs.
Submission Type: Full paper proceedings track submission (max 9 main pages).
Submission Number: 51