GradGCL: Gradient Graph Contrastive Learning

Published: 2024, Last Modified: 13 May 2025. Venue: ICDE 2024. License: CC BY-SA 4.0
Abstract: Graph self-supervised learning, which aims to learn graph representations with little label information, is an important task in data mining and machine learning because labeled graph data is scarce and expensive to obtain in the real world. Contrastive learning has emerged as a promising solution. However, we show that existing graph contrastive learning (GCL) models have a significant issue: they generate representations that collapse into a low-dimensional subspace, resulting in a loss of information and diversity. We believe this issue arises from the strong assumption in current GCL methods that all positive samples should be close together and all negative samples should be far apart in the representation space. From a data engineering view, this assumption fails to mine the graph data deeply and oversimplifies its complexity and heterogeneity, leading to clustered and redundant representations. To address this issue, we propose GradGCL, a novel method that leverages intrinsic gradient information as an additional input signal to regularize GCL training. The gradient information reflects how the representations are optimized with respect to the contrastive loss, and so provides a perspective complementary to the representations themselves. Furthermore, we design a soft separation strategy that relaxes the hard separation between positive and negative samples, allowing for more flexibility and diversity in the representation space. We conduct extensive experiments on various graph-related tasks, using different types of contrastive losses, datasets, and model architectures. We demonstrate that gradients alone can capture graph information and achieve results competitive with representation-based GCL methods. We also show that GradGCL can enhance existing GCL models and prevent dimensional collapse.
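To make the notion of "gradient information with respect to the contrastive loss" concrete, the following is a minimal sketch, not the GradGCL implementation. It assumes an InfoNCE loss with dot-product similarity as a stand-in, since the abstract does not fix a particular loss, and it does not show how GradGCL feeds the gradients back into training. The function names `info_nce_loss` and `info_nce_grad`, the temperature `tau`, and the toy vectors are hypothetical. For anchor u_i with positive v_i and softmax weights p_ij over all v_j, the gradient of the mean loss is (1 / (n * tau)) * (sum_j p_ij v_j - v_i), which yields one gradient vector per representation.

```python
import math


def _logits(u_i, v, tau):
    # Dot-product similarity between one anchor and every candidate, scaled by tau.
    return [sum(a * b for a, b in zip(u_i, v_j)) / tau for v_j in v]


def info_nce_loss(u, v, tau=0.5):
    """Mean InfoNCE loss: v[i] is the positive for u[i]; every other v[j] is a negative."""
    total = 0.0
    for i, u_i in enumerate(u):
        logits = _logits(u_i, v, tau)
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(u)


def info_nce_grad(u, v, tau=0.5):
    """Closed-form gradient of the mean loss with respect to each anchor u[i]."""
    n, d = len(u), len(u[0])
    grads = []
    for i, u_i in enumerate(u):
        logits = _logits(u_i, v, tau)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        p = [e / z for e in exps]  # softmax weights over candidates
        grads.append([
            (sum(p[j] * v[j][k] for j in range(n)) - v[i][k]) / (tau * n)
            for k in range(d)
        ])
    return grads


# Toy representations from two hypothetical augmented views (assumed values).
u = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
v = [[0.8, 0.6], [0.0, 1.0], [1.0, 0.0]]

grads = info_nce_grad(u, v)  # one gradient vector per anchor, same shape as u
```

Each row of `grads` has the same dimensionality as the corresponding representation, so in a sketch like this the gradients could be treated as a second set of per-sample features alongside the representations.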