Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel

Published: 13 Oct 2023, Last Modified: 06 Nov 2023. Accepted by TMLR.
Abstract: The fundamental principle of Graph Neural Networks (GNNs) is to exploit the structural information of the data by aggregating the neighboring nodes using a `graph convolution', in conjunction with suitable choices for the network architecture, such as its depth and activation functions. Therefore, understanding the influence of each design choice on the network performance is crucial. Convolutions based on the graph Laplacian have emerged as the dominant choice, with symmetric normalization of the adjacency matrix being the most widely adopted. However, some empirical studies show that row normalization of the adjacency matrix outperforms it in node classification. Despite the widespread use of GNNs, there is no rigorous theoretical study of the representation power of these convolutions that could explain this behavior. Similarly, the empirical observation that linear GNNs perform on par with non-linear ReLU GNNs lacks a rigorous theoretical justification. In this work, we theoretically analyze the influence of different aspects of the GNN architecture using the Graph Neural Tangent Kernel in a semi-supervised node classification setting. Under the population Degree Corrected Stochastic Block Model, we prove that: (i) linear networks capture the class information as well as ReLU networks; (ii) row normalization preserves the underlying class structure better than other convolutions; (iii) performance degrades with network depth due to over-smoothing, but the loss of class information is slowest under row normalization; (iv) skip connections retain the class information even at infinite depth, thereby eliminating over-smoothing. Finally, we validate our theoretical findings numerically and on real datasets such as Cora and Citeseer.
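To make the two convolutions compared in the abstract concrete, here is a minimal NumPy sketch (ours, not taken from the submission's code): it contrasts symmetric normalization $D^{-1/2}(A+I)D^{-1/2}$ with row normalization $D^{-1}(A+I)$, and illustrates over-smoothing by propagating toy features to increasing depth. The 4-node graph, the self-loop convention $A+I$, and the random features are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of the two graph convolutions and of over-smoothing with depth.
import numpy as np

# Toy undirected graph (illustrative, not from the paper).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

A_hat = A + np.eye(len(A))           # add self-loops (common GCN convention)
deg = A_hat.sum(axis=1)              # node degrees of A + I

# Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
S_sym = np.diag(deg ** -0.5) @ A_hat @ np.diag(deg ** -0.5)

# Row normalization: D^{-1} (A + I); each row sums to 1 (a random-walk step).
S_row = np.diag(1.0 / deg) @ A_hat

# Depth-d linear propagation S^d X: as d grows, the rows of S^d X become
# nearly identical, which is the over-smoothing effect the paper analyzes.
X = np.random.default_rng(0).normal(size=(4, 2))   # toy node features
for S, name in [(S_sym, "sym"), (S_row, "row")]:
    for d in (1, 4, 16):
        H = np.linalg.matrix_power(S, d) @ X
        print(f"{name}, d={d:2d}, feature std across nodes: {H.std(axis=0)}")
```

The printed per-feature standard deviation across nodes shrinks with depth for both convolutions, which is one simple way to see node representations collapsing; the paper's results quantify how fast class information is lost and how skip connections prevent this collapse.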
Submission Length: Regular submission (no more than 12 pages of main content)
Supplementary Material: zip
Previous TMLR Submission Url: https://openreview.net/forum?id=ehPngdfeMc
Changes Since Last Submission: We thank all the reviewers and action editors for the constructive feedback and have made the following changes to address the comments:
* Improved the introduction of the DC-SBM in Section 3. The discussion of the class separability of the kernel and the results are now clarified and highlighted in Sections 3 and 4-5, respectively.
* For the case of heterophily, included a discussion of the analysis on a synthetic graph for vanilla GCN, Skip-PC and Skip-$\alpha$ in Figures 13 and 14 in the appendix.
* Added the missing references in Section 8 under related work.
* Added Figures 8, 12, 16 and 23 with discussions in the Appendix to address the comments from A2PN.
* Clarified the experimental claims.
* Added the references pointed out by the action editors in Section A of the Appendix.
Code: https://github.com/mahalakshmi-sabanayagam/NTK_GCN
Assigned Action Editor: ~Lechao_Xiao2
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1363