Investigating Out-of-Distribution Generalization of GNNs: An Architecture Perspective
Abstract: Graph neural networks (GNNs) have exhibited remarkable performance under the assumption that test data comes from the same
distribution as the training data. However, in real-world scenarios, this
assumption may not always be valid. Consequently, there is a growing focus on exploring the Out-of-Distribution (OOD) problem in
the context of graphs. Most existing efforts have primarily concentrated on improving graph OOD generalization from two model-agnostic perspectives: data-driven methods and strategy-based
learning. However, there has been limited attention dedicated to
investigating the impact of well-known GNN model architectures on graph OOD generalization, which is orthogonal to existing research. In this work, we provide the first comprehensive
investigation of OOD generalization on graphs from an architecture
perspective, by examining the common building blocks of modern
GNNs. Through extensive experiments, we reveal that both the
graph self-attention mechanism and the decoupled architecture
contribute positively to graph OOD generalization. In contrast, we
observe that the linear classification layer tends to compromise
graph OOD generalization capability. Furthermore, we provide
in-depth theoretical insights and discussions to underpin these discoveries. These insights have empowered us to develop a novel
GNN backbone model, DGat, designed to harness the robust properties of both the graph self-attention mechanism and the decoupled
architecture. Extensive experimental results demonstrate the effectiveness of our model under graph OOD, exhibiting substantial and
consistent enhancements across various training strategies.
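To make the two architectural ideas named above concrete, the following is a minimal, illustrative sketch, not the authors' DGat implementation: it separates the learnable feature transform from propagation (the "decoupled" design) and aggregates neighbors with attention weights restricted to graph edges. All names and parameters here (decoupled_attention_gnn, n_prop, the dot-product attention form) are assumptions made for illustration.

```python
# Hedged sketch: decoupled feature transform + attention-weighted propagation.
# This is NOT the paper's DGat; it only illustrates the two building blocks
# the abstract discusses (graph self-attention, decoupled architecture).
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable row-wise softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decoupled_attention_gnn(X, A, W, n_prop=2):
    """X: node features (N, d_in); A: adjacency with self-loops (N, N);
    W: weights (d_in, d_out) for the decoupled feature transform."""
    # Decoupled design: apply the learnable transform once, before propagation.
    H = X @ W
    # Propagation: attention scores from pairwise similarity, masked by edges.
    for _ in range(n_prop):
        scores = H @ H.T / np.sqrt(H.shape[1])   # dot-product attention logits
        scores = np.where(A > 0, scores, -1e9)   # attend only along graph edges
        alpha = softmax(scores, axis=1)          # row-normalized attention weights
        H = alpha @ H                            # attention-weighted aggregation
    return H

# Tiny usage example on a 4-node path graph (with self-loops).
rng = np.random.default_rng(0)
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
print(decoupled_attention_gnn(X, A, W).shape)  # (4, 2)
```

The point of the sketch is the ordering: the parameterized transform happens once, while the (non-parametric here) attention propagation can be repeated, which is the separation the abstract refers to as a decoupled architecture.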