Memory Disagreement: A Pseudo-Labeling Measure from Training Dynamics for Semi-supervised Graph Learning

Hongbin Pei; Yuheng Xiong; Pinghui Wang; Jing Tao; Jialun Liu; Huiqi Deng; Jie Ma; Xiaohong Guan

Memory Disagreement: A Pseudo-Labeling Measure from Training Dynamics for Semi-supervised Graph Learning

Hongbin Pei, Yuheng Xiong, Pinghui Wang, Jing Tao, Jialun Liu, Huiqi Deng, Jie Ma, Xiaohong Guan

Published: 23 Jan 2024, Last Modified: 23 May 2024TheWebConf24EveryoneRevisionsBibTeX

Keywords: Graph Neural Networks, Epistemic Uncertainty, Self-Training

TL;DR: We uncover that training dynamics contain valuable information regarding prediction uncertainty in graph learning; accordingly, we propose a novel uncertainty measure for pseudo-labeling, termed memory disagreement (MoDis for short).

Abstract: In the realm of semi-supervised graph learning, pseudo-labeling is a pivotal strategy to utilize both labeled and unlabeled nodes for model training. Currently, confidence score is the most frequently used pseudo-labeling measure, however, it suffers from poor calibration and issues in out-of-distribution data. In this paper, we propose memory disagreement (MoDis for short), a novel uncertainty measure for pseudo-labeling. We uncover that training dynamics offer significant insights into prediction uncertainty --- *if a graph model makes consistent predictions for an unlabeled node throughout training, the corresponding predicted label is likely to be correct. Thus, the node should be suitable for pseudo-labeling*. We implement MoDis as the entropy of an accumulated distribution that summarizes the disagreement of the model's predictions throughout training. We further enhance and analyze MoDis in case studies, which show nodes with low MoDis are suitable for pseudo-labeling as these nodes tend to be distant from boundaries in both graph and representation space. We design MoDis based pseudo-label selection algorithm and corresponding pseudo-labeling algorithm, which are applicable to various graph neural networks. We empirically validate MoDis on eight benchmark graph datasets. The experimental results show that pseudo labels given by MoDis have better quality in correctness and information gain, and the algorithm benefits various graph neural networks, achieving an average improvement of 3.11\% and reaching up to 30.24\% when compared to the wildly-used uncertainty measure, confidence score. Moreover, we demonstrate the efficacy of MoDis on out-of-distribution nodes. All code will be released after reviewing, according to the conference policy.

Track: Graph Algorithms and Learning for the Web

Submission Guidelines Scope: Yes

Submission Guidelines Blind: Yes

Submission Guidelines Format: Yes

Submission Guidelines Limit: Yes

Submission Guidelines Authorship: Yes

Student Author: No

Submission Number: 479

Loading