GraphDenoiser: An Unsupervised Iterative Framework for Node Label Denoising in Graph-Structured Data
Keywords: Graph, Unsupervised Learning, Node Label Denoising, Iterative Optimization
Abstract: Annotation errors are a core challenge in supervised learning: label noise interferes with the model's learning of the underlying data distribution and biases its decisions on the target task, significantly reducing predictive accuracy. In graph-structured data, the relationships between data points mean that traditional denoising methods struggle to adapt to the resulting noise distribution patterns. To address this issue, this paper focuses on denoising node type labels in graph data and proposes GraphDenoiser, a denoising framework based on unsupervised learning. By iterating for multiple rounds between a node label noise prediction model and a synthetic data generation model, the framework outputs a noise probability for each node label and accurately locates mislabeled nodes in the graph. This provides a reliable noise diagnosis for subsequent label correction and robust graph model training, thereby alleviating the impact of node mislabeling on supervised learning performance. Experiments with eight noise injection methods across three datasets show that, compared with previous methods, MCC and F1 increase by 22.81% and 30.51%, respectively.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 3770
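The abstract reports results in terms of MCC and F1 for locating mislabeled nodes. As a rough illustration of how such scores could be computed from per-node noise probabilities, the sketch below thresholds the probabilities and compares the flagged nodes against ground-truth mislabeling indicators. The function name `detection_metrics`, the 0.5 threshold, and the toy values are assumptions made for illustration only; they are not taken from the paper and do not reproduce its evaluation protocol.

```python
import math


def detection_metrics(noise_prob, is_mislabeled, threshold=0.5):
    """Return (MCC, F1) for mislabeled-node detection.

    noise_prob:    per-node noise probabilities (hypothetical model output)
    is_mislabeled: per-node ground truth, True if the label is actually wrong
    threshold:     assumed cut-off above which a node is flagged as noisy
    """
    flagged = [p >= threshold for p in noise_prob]

    # Confusion counts for the binary "is this label noisy?" decision.
    tp = sum(f and t for f, t in zip(flagged, is_mislabeled))
    fp = sum(f and not t for f, t in zip(flagged, is_mislabeled))
    fn = sum(not f and t for f, t in zip(flagged, is_mislabeled))
    tn = sum(not f and not t for f, t in zip(flagged, is_mislabeled))

    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return mcc, f1


# Toy example with made-up values for six nodes.
probs = [0.9, 0.2, 0.7, 0.1, 0.4, 0.8]
truth = [True, False, False, False, True, True]
mcc, f1 = detection_metrics(probs, truth)
print(f"MCC={mcc:.4f}  F1={f1:.4f}")
```

MCC uses all four confusion counts, so it stays informative when mislabeled nodes are a small minority, whereas F1 ignores true negatives; reporting both gives complementary views of detection quality.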