SDMG: Smoothing Your Diffusion Models for Powerful Graph Representation Learning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: SDMG bridges the gap between generative and representation learning by focusing reconstruction on low-frequency graph signals, improving performance across graph tasks.
Abstract: Diffusion probabilistic models (DPMs) have recently demonstrated impressive generative capabilities. There is emerging evidence that their sample reconstruction ability can yield meaningful representations for recognition tasks. In this paper, we demonstrate that the objectives underlying generation and representation learning are not perfectly aligned. Through a spectral analysis, we find that minimizing the mean squared error (MSE) between the original graph and its reconstructed counterpart does not necessarily optimize representations for downstream tasks. Instead, focusing on reconstructing a small subset of features, specifically those capturing global information, proves to be more effective for learning powerful representations. Motivated by these insights, we propose a novel framework, the Smooth Diffusion Model for Graphs (SDMG), which introduces a multi-scale smoothing loss and low-frequency information encoders to promote the recovery of global, low-frequency details, while suppressing irrelevant high-frequency noise. Extensive experiments validate the effectiveness of our method, suggesting a promising direction for advancing diffusion models in graph representation learning.
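To make the abstract's core idea concrete, below is a minimal, illustrative sketch of a multi-scale smoothing loss in PyTorch. It is not the authors' implementation: the function names, the use of repeated propagation with the symmetrically normalized adjacency as the low-pass filter, and the scales=(1, 2, 4) schedule are all assumptions chosen for illustration. The intent is only to show how one might reward recovery of smooth (low-frequency) components while down-weighting high-frequency detail.

```python
# Illustrative sketch (not the paper's code): compare low-pass filtered
# versions of the original and reconstructed node features at several
# smoothing scales, instead of raw feature-wise MSE.
import torch
import torch.nn.functional as F


def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, a standard
    low-pass graph filter when applied repeatedly."""
    adj = adj + torch.eye(adj.size(0), device=adj.device)  # add self-loops
    deg = adj.sum(dim=1)
    d_inv_sqrt = deg.clamp(min=1e-12).pow(-0.5)
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def multiscale_smoothing_loss(x_orig, x_recon, adj_norm, scales=(1, 2, 4)):
    """MSE between smoothed originals and smoothed reconstructions.

    Each scale k applies the normalized adjacency k times; larger k keeps
    only smoother, more global signal components, so errors in
    high-frequency detail contribute less to the loss.
    """
    loss = 0.0
    s_orig, s_recon = x_orig, x_recon
    step = 0
    for k in sorted(scales):
        while step < k:  # propagate features up to the next scale
            s_orig = adj_norm @ s_orig
            s_recon = adj_norm @ s_recon
            step += 1
        loss = loss + F.mse_loss(s_recon, s_orig)
    return loss / len(scales)
```

Under these assumptions, the loss could be added to a graph diffusion model's training objective in place of (or alongside) the usual per-feature reconstruction MSE; the hypothetical `scales` parameter controls how aggressively high-frequency content is suppressed.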
Lay Summary: Graphs describe many things we care about, from social networks to protein molecules, yet teaching computers to read them usually needs lots of hand-labeled examples. A new family of “diffusion” AI models can learn without labels by first adding noise to a graph and then learning to undo it. However, these models try to rebuild every tiny detail, including random clutter, which paradoxically weakens the final predictions. We created SDMG (Smooth Diffusion Model for Graphs), a training recipe that guides the model to focus on the broad, slow-changing patterns in a graph, like spotting the outline of a coastline from an airplane rather than counting every pebble on the beach. Simple smoothing rules reward the recovery of these big-picture signals and allow the model to ignore distracting high-frequency noise. With this small change, SDMG produces sharper graph insights: on standard benchmarks it outperforms previous self-supervised methods at tasks like classifying research papers or identifying molecule types, all while using the same data. Better, label-efficient graph understanding could speed up drug discovery, improve fault detection in power grids, and help scientists make sense of complex climate networks, bringing the benefits of machine learning to areas where labeled data are scarce.
Primary Area: General Machine Learning->Representation Learning
Keywords: Graph Representation Learning, Graph Diffusion Model, Diffusion Probabilistic Models, Graph Self-supervised Learning
Submission Number: 10814