Diffusion Aligned Embeddings

ICLR 2026 Conference Submission 23692 Authors

20 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: manifold learning, dimensionality reduction, large-scale data visualization, embeddings, representation learning
TL;DR: We propose DAE, a method that aligns random walks to create low-dimensional embeddings. DAE preserves both local and global structure and recovers meaningful patterns in benchmarks and single-cell RNA-seq.
Abstract: This paper introduces DAE, which formulates dimensionality reduction as aligning diffusion processes between high- and low-dimensional spaces. By minimizing the Path-KL divergence, which uniquely captures both the transition probabilities and the waiting times of continuous-time random walks, we prove formal bounds on generator and semigroup closeness, guaranteeing structure preservation across scales. Our optimization algorithm decomposes this objective into attraction-repulsion terms with an unbiased gradient estimator, enabling an efficient parallel implementation. Experiments on single-cell RNA-seq datasets show that DAE consistently preserves both local neighborhoods and global structure, while our CUDA implementation scales to millions of cells with competitive runtime. The Path-KL framework provides theoretical guarantees that complement existing diffusion-based methods. DAE will be made available with CPU and GPU implementations.
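The abstract describes an attraction-repulsion decomposition of the objective with an unbiased gradient estimator, a structure shared by several neighbor-embedding methods. As a rough illustration only, the sketch below shows a generic stochastic update of this shape, where attraction pulls graph-neighbor pairs together and repulsion is estimated by uniform negative sampling. Every name and formula here (the quadratic attraction surrogate, the bounded repulsion kernel, n_neg) is a hypothetical stand-in; DAE's actual terms come from the Path-KL objective, which this page does not spell out.

```python
# Hypothetical sketch of an attraction-repulsion embedding update with
# negative sampling. This is NOT the authors' DAE implementation; it only
# illustrates the generic estimator structure the abstract mentions.
import numpy as np

rng = np.random.default_rng(0)

def sgd_step(Y, edges, lr=0.1, n_neg=5):
    """One stochastic pass over sampled edges of the neighborhood graph.

    Y     : (n_points, dim) low-dimensional embedding, updated in place.
    edges : iterable of (i, j) index pairs drawn from the high-dimensional
            neighborhood graph (the "attraction" pairs).
    n_neg : negatives sampled per edge; sampling negatives uniformly keeps
            the repulsion gradient unbiased for the full pairwise sum
            (an assumption about the estimator, not a claim about DAE).
    """
    n = Y.shape[0]
    for i, j in edges:
        d = Y[i] - Y[j]
        # Attraction: pull graph neighbors together (a simple quadratic
        # surrogate here; DAE's attraction term derives from Path-KL).
        Y[i] -= lr * d
        Y[j] += lr * d
        # Repulsion: push i away from uniformly sampled negatives; the
        # 1/(1 + ||d||^2) factor bounds the force, a t-SNE/UMAP-style
        # kernel chosen purely for illustration.
        for k in rng.integers(0, n, size=n_neg):
            if k == i:
                continue
            d = Y[i] - Y[k]
            Y[i] += lr * d / (1.0 + d @ d)
    return Y
```

In this pattern, each edge contributes one cheap attraction update and a handful of repulsion updates, so the per-step cost is independent of the dataset size; that property is what makes such estimators amenable to the kind of parallel CUDA implementation the abstract reports.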
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 23692