Abstract: Graph Neural Networks (GNNs) have shown remarkable performance in semi-supervised node classification, but their effectiveness diminishes in settings with extremely limited labeled data. The scarcity of labeled nodes leads to an under-reaching issue, where unlabeled nodes receive insufficient supervision, resulting in poor generalization. In this paper, we propose G-NodeMixup, a generalized extension of our previously proposed NodeMixup framework, which was designed to address under-reaching by improving communication between labeled and unlabeled nodes. G-NodeMixup introduces three novel components: (1) Multi-set Pairing, which performs mixup between labeled-labeled, labeled-unlabeled, and unlabeled-unlabeled node pairs to enhance node interactions and promote smoother decision boundaries; (2) Subgraph-based Mixup, which restricts mixup to k-hop subgraphs to preserve graph locality and avoid disruptive global edge modifications; and (3) a Consistency Regularization-based Mixup Loss, which reduces reliance on noisy pseudo-labels by enforcing consistency between mixed node predictions. Our framework is architecture-agnostic and can be applied to various GNN models without significant architectural changes or excessive computational overhead. Experimental results on several benchmark datasets demonstrate that G-NodeMixup consistently improves GNN performance in extremely limited labeled settings, achieving state-of-the-art results and establishing its practical effectiveness.
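The abstract does not spell out the mixup rule itself, but mixup conventionally interpolates a pair of inputs and their labels with a Beta-distributed mixing ratio. Below is a minimal sketch of that interpolation applied to one labeled-unlabeled node pair (one of the three pairing sets named above); the function name, variables, and the use of a soft pseudo-label for the unlabeled node are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def node_mixup(x_i, x_j, y_i, y_j, alpha=1.0):
    """Standard mixup on a pair of node feature vectors and their
    label distributions. `alpha` parameterizes the Beta distribution
    from which the mixing ratio lambda is drawn (illustrative sketch,
    not the paper's exact rule)."""
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x_i + (1.0 - lam) * x_j   # convex combination of features
    y_mix = lam * y_i + (1.0 - lam) * y_j   # matching combination of labels
    return x_mix, y_mix, lam

# Hypothetical labeled-unlabeled pair: the unlabeled node contributes
# a soft pseudo-label rather than a ground-truth one-hot label.
x_l = np.array([1.0, 0.0, 2.0])   # features of a labeled node
y_l = np.array([1.0, 0.0])        # its one-hot label
x_u = np.array([0.0, 1.0, 1.0])   # features of an unlabeled node
y_u = np.array([0.3, 0.7])        # its soft pseudo-label
x_mix, y_mix, lam = node_mixup(x_l, x_u, y_l, y_u)
```

Because both label vectors sum to one, the mixed label `y_mix` remains a valid distribution, which is what lets the mixed pair be trained on with an ordinary cross-entropy or consistency loss.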