DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

Qitian Wu; Chenxiao Yang; Wentao Zhao; Yixuan He; David Wipf; Junchi Yan

Published: 01 Feb 2023; Last Modified: 22 Jun 2025; ICLR 2023 notable top 25%; Readers: Everyone

Keywords: structured representation learning, diffusion model, optimization-induced model, node prediction

Abstract: Real-world data generation often involves complex inter-dependencies among instances, violating the IID-data hypothesis of standard learning paradigms and posing a challenge for uncovering the geometric structures needed to learn desired instance representations. To this end, we introduce an energy constrained diffusion model that encodes a batch of instances from a dataset into evolutionary states, which progressively incorporate other instances' information through their interactions. The diffusion process is constrained by descent criteria w.r.t. a principled energy function that characterizes the global consistency of instance representations over latent structures. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs, which gives rise to a new class of neural encoders, dubbed DIFFormer (diffusion-based Transformers), with two instantiations: a simple version with linear complexity for prohibitively large numbers of instances, and an advanced version for learning complex structures. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks, such as node classification on large graphs, semi-supervised image/text classification, and spatial-temporal dynamics prediction. The code is available at https://github.com/qitianwu/DIFFormer.
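To make the idea of a linear-complexity diffusion step concrete, here is a minimal sketch in plain Python. It is not taken from the abstract or from the authors' repository. The update rule (a convex combination of each state with a similarity-weighted average of all states), the similarity `1 + cos(z_i, z_j)`, the step size `tau`, and the function name `diffusion_step` are all assumptions chosen for illustration. The point it shows is that, for a similarity of this form, the all-pairs aggregation can be computed without ever building the n × n matrix.

```python
import math


def diffusion_step(z, tau=0.5):
    """One illustrative diffusion step over n instance states of dimension d.

    Assumed update (hypothetical, for illustration only):
        z_i <- (1 - tau) * z_i + tau * sum_j s_ij * z_j
    where s_ij is proportional to 1 + cos(z_i, z_j) and each row of s sums to 1.
    Cost is O(n * d^2) rather than O(n^2 * d).
    """
    n, d = len(z), len(z[0])

    # Unit-normalize each state so inner products are cosines in [-1, 1].
    q = []
    for row in z:
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        q.append([x / norm for x in row])

    # Global summaries shared by every instance, computed once.
    kv = [[sum(q[j][a] * z[j][b] for j in range(n)) for b in range(d)]
          for a in range(d)]                                  # sum_j q_j z_j^T
    z_sum = [sum(z[j][b] for j in range(n)) for b in range(d)]  # sum_j z_j
    q_sum = [sum(q[j][a] for j in range(n)) for a in range(d)]  # sum_j q_j

    out = []
    for i in range(n):
        # Row normalizer: sum_j (1 + q_i . q_j) = n + q_i . sum_j q_j
        denom = n + sum(q[i][a] * q_sum[a] for a in range(d))
        # Numerator: sum_j (1 + q_i . q_j) z_j = sum_j z_j + q_i^T (sum_j q_j z_j^T)
        agg = [(z_sum[b] + sum(q[i][a] * kv[a][b] for a in range(d))) / denom
               for b in range(d)]
        out.append([(1 - tau) * z[i][b] + tau * agg[b] for b in range(d)])
    return out
```

Calling `diffusion_step(z, tau=0.0)` leaves the states unchanged, while larger `tau` pulls each state toward the similarity-weighted average of the batch. The reordering of sums is what avoids the quadratic cost in this sketch; it relies on the similarity being affine in `q_i . q_j`.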

Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning

TL;DR: We introduce an energy constrained diffusion model for semi-supervised representation learning, from which we derive a new class of neural encoders that learn inter-instance latent graphs efficiently and effectively.

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/difformer-scalable-transformers-induced-by/code)