Progressive low-confidence pseudolabeling for semisupervised node classification

Tao Zhu, Hua Mao, Hui Liu, Jie Chen

Published: 01 Dec 2025, Last Modified: 11 Nov 2025, Neurocomputing, CC BY-SA 4.0
Abstract: Graph neural networks (GNNs) have demonstrated remarkable achievements in handling graph-structured data. However, the performance of GNNs is typically limited by the lack of sufficient labeled data, which are time-consuming to obtain in real-world scenarios. Pseudolabeling has been applied to GNNs to augment the training set with unlabeled data. Most pseudolabeling methods on graphs assign pseudolabels to nodes based on high-confidence thresholds. However, nodes near labeled ones generally obtain high confidence scores during training. As a result, an increasing number of similar nodes are assigned pseudolabels during training, which potentially leads to a distribution shift between the labeled dataset and the augmented dataset. The distribution of the augmented dataset then diverges significantly from that of the entire graph, causing the GNNs to perform poorly on test data. In this paper, we propose a progressive low-confidence pseudolabeling (PLCP) method to progressively leverage the low-confidence data. Specifically, pseudolabels are assigned to nodes within a predefined confidence-based ranking range. To alleviate distribution shift, we keep this range constant throughout the training process to prevent excessive nodes from being assigned pseudolabels. The range is designed to be sufficiently wide to leverage low-confidence nodes. Low-confidence nodes from the range propagate information to their neighbors, which helps the model capture patterns in uncertain regions. To alleviate the impact of noisy pseudolabels, we propose a validation-based reassignment scheme that uses validation metrics to assign more reliable pseudolabels. Extensive experiments demonstrate that the proposed PLCP improves the performance of state-of-the-art GNNs on graph datasets in comparison with several established methods.
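The rank-range selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_by_rank_range`, the dictionary layout of the predictions, the use of the maximum class probability as the confidence score, and the half-open range convention are all assumptions made for illustration. The validation-based reassignment scheme is not shown, because the abstract does not specify it.

```python
from typing import Dict, Sequence, Tuple


def select_by_rank_range(
    probs: Dict[int, Sequence[float]],
    rank_range: Tuple[int, int],
) -> Dict[int, int]:
    """Return {node_id: pseudolabel} for nodes whose confidence rank
    lies in the half-open range [start, stop).

    probs maps each unlabeled node id to its predicted class probabilities.
    Rank 0 is the most confident node. The data layout and the range
    convention are assumptions of this sketch, not taken from the paper.
    """
    start, stop = rank_range
    # Assumed confidence score: the largest class probability of the node.
    ranked = sorted(probs, key=lambda n: max(probs[n]), reverse=True)
    # The same fixed window of ranks is used every round, so the number
    # of newly pseudolabeled nodes never exceeds stop - start.
    selected = ranked[start:stop]
    # Pseudolabel: index of the most probable class.
    return {
        n: max(range(len(probs[n])), key=lambda c: probs[n][c])
        for n in selected
    }


# Hypothetical predictions for five unlabeled nodes and two classes.
probs = {
    10: [0.95, 0.05],
    11: [0.40, 0.60],
    12: [0.80, 0.20],
    13: [0.55, 0.45],
    14: [0.10, 0.90],
}
pseudo = select_by_rank_range(probs, rank_range=(1, 4))
print(pseudo)
```

In this toy run the window skips the single most confident node (node 10) and reaches down to node 11, whose confidence is only 0.60. A high-confidence threshold would typically exclude such a node, whereas a fixed rank window includes it while keeping the number of pseudolabeled nodes per round bounded.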