Calibrating Graph Neural Networks from a Data-centric Perspective

Published: 23 Jan 2024, Last Modified: 23 May 2024, Venue: TheWebConf24
Keywords: Graph Neural Network, Calibration, Data-centric Learning
TL;DR: Calibrating graph neural networks by modifying the input graph instead of temperature scaling.
Abstract: Graph neural networks (GNNs) have gained popularity in modeling various complex networks, e.g., social networks and webpage networks. Despite their promising accuracy, the confidences of GNNs have been shown to be miscalibrated, indicating limited awareness of prediction uncertainty and harming the reliability of model decisions. Existing calibration methods primarily focus on improving GNN models, e.g., adding regularization during training or introducing temperature scaling after training. In this paper, we argue that the miscalibration of GNNs may stem from the graph data and can be alleviated through topology modification. To support this motivation, we conduct data observations by examining the impact of decisive and homophilic edges on calibration performance, where decisive edges play a critical role in GNN predictions and homophilic edges connect nodes of the same class. By assigning larger weights to these edges in the adjacency matrix, we observe an improvement in calibration performance without sacrificing classification accuracy. This suggests the potential of a data-centric approach for calibrating GNNs. Motivated by our observations, we propose Data-centric Graph Calibration (DCGC), which uses two edge weighting modules to adjust the input graph for GNN calibration. The first module learns the weights of decisive edges by parameterizing the adjacency matrix and enabling backpropagation of the prediction loss to edge weights. This emphasizes critical edges that fit the prediction needs. The second module computes weights for homophilic edges based on predicted label distributions, assigning larger weights to edges with stronger homophily. These modifications operate at the data level and can be easily integrated with temperature scaling-based methods for better calibration.
Experimental results on 8 benchmark datasets demonstrate that DCGC achieves state-of-the-art calibration performance, with an average relative improvement of 36.4% in expected calibration error (ECE), while maintaining or even slightly improving classification accuracy. Ablation studies and hyper-parameter analysis further validate the effectiveness and robustness of DCGC.
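To make two quantities from the abstract concrete, here is a minimal NumPy sketch. The first function computes ECE with the standard equal-width binning definition. The second is only an illustration of weighting edges "based on predicted label distributions": it uses the dot product of the two endpoint distributions as a homophily proxy, which is an assumption made here and not necessarily the formula DCGC uses. The names `expected_calibration_error`, `homophily_edge_weights`, and the `edge_index` layout (a 2 x E array of source and target node ids) are hypothetical.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Standard ECE: bin-size-weighted mean of |accuracy - confidence|
    over equal-width confidence bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)             # confidence of the predicted class
    pred = probs.argmax(axis=1)          # predicted class per node
    correct = (pred == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)    # samples in the bin (lo, hi]
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap         # weight by fraction of samples
    return float(ece)


def homophily_edge_weights(probs, edge_index):
    """Illustrative homophily proxy (an assumption, not the paper's formula):
    the dot product of the endpoints' predicted label distributions, i.e. the
    probability that both endpoints fall in the same class if their labels
    were drawn independently from those distributions."""
    probs = np.asarray(probs, dtype=float)
    src, dst = np.asarray(edge_index)
    return (probs[src] * probs[dst]).sum(axis=1)


if __name__ == "__main__":
    # Toy predictions for four nodes and two classes (made-up numbers).
    probs = np.array([[0.95, 0.05], [0.85, 0.15], [0.25, 0.75], [0.35, 0.65]])
    labels = np.array([0, 1, 1, 1])
    edge_index = np.array([[0, 0], [1, 2]])  # edges (0, 1) and (0, 2)
    print("ECE:", expected_calibration_error(probs, labels))
    print("edge weights:", homophily_edge_weights(probs, edge_index))
```

In this toy example the edge between nodes 0 and 1, whose predicted distributions agree, receives a larger weight than the edge between nodes 0 and 2, matching the abstract's description of assigning larger weights to edges with stronger homophily.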
Track: Graph Algorithms and Learning for the Web
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: No
Submission Number: 1452