Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY-NC 4.0
TL;DR: We introduce a highly scalable algorithm for learning to sample given only an energy function — the first to allow far more gradient updates than energy evaluations — and scale it to new benchmarks on conformer generation.
Abstract: We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities, or energy functions. It is the first on-policy approach that allows significantly more gradient updates than the number of energy evaluations and model samples, enabling us to scale to much larger problem settings than previously explored by similar methods. Our framework is theoretically grounded in stochastic optimal control and inherits the theoretical guarantees of Adjoint Matching: it trains without corrective measures that push samples toward the target distribution. We show how to incorporate key symmetries, as well as periodic boundary conditions, for modeling molecules in both Cartesian and torsional coordinates. We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models, where we perform amortized conformer generation across many molecular systems. To encourage further research in developing highly scalable sampling methods, we plan to open source these challenging benchmarks, where successful methods can directly impact progress in computational chemistry. Code and benchmarks are provided at https://github.com/facebookresearch/adjoint_sampling.
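To make the efficiency claim concrete, below is a minimal, self-contained sketch of the training pattern the abstract describes: one on-policy rollout and one energy-gradient evaluation per batch of samples, followed by many cheap regression updates that reuse those samples. Everything here is an illustrative assumption, not the released implementation — the toy `energy`, the `Control` network, and especially the simplified regression target; the exact target is derived via Adjoint Matching in the paper.

```python
import torch
import torch.nn as nn

def energy(x):
    # Hypothetical double-well energy; stands in for an expensive
    # molecular energy model (the paper's setting).
    return ((x ** 2 - 1.0) ** 2).sum(-1)

class Control(nn.Module):
    # u_theta(x, t): the learned drift of the sampling SDE.
    def __init__(self, dim=1, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, hidden), nn.SiLU(),
                                 nn.Linear(hidden, dim))
    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

dim, sigma, n_steps = 1, 1.0, 20
control = Control(dim)
opt = torch.optim.Adam(control.parameters(), lr=1e-3)

for outer in range(200):
    # On-policy rollout: simulate the controlled SDE from noise.
    with torch.no_grad():
        x = torch.randn(256, dim)
        for k in range(n_steps):
            t = torch.full((x.shape[0], 1), k / n_steps)
            x = (x + control(x, t) / n_steps
                 + sigma * (1.0 / n_steps) ** 0.5 * torch.randn_like(x))

    # One energy-gradient evaluation per sample, cached for reuse.
    x1 = x.detach().requires_grad_(True)
    grad_E = torch.autograd.grad(energy(x1).sum(), x1)[0].detach()
    x1 = x1.detach()

    # Many cheap regression updates per batch of (sample, gradient)
    # pairs -- the "more gradient updates than energy evaluations"
    # property.  The target below is a deliberate simplification.
    for _ in range(50):
        t = torch.rand(x1.shape[0], 1)
        xt = t * x1 + (t * (1 - t)).sqrt() * sigma * torch.randn_like(x1)
        loss = ((control(xt, t) + grad_E) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The key property this sketch gestures at, and which the paper establishes rigorously, is that the regression target can be computed from the stored endpoint samples and energy gradients alone, without differentiating through the rollout — this is what makes the many-updates-per-rollout regime sound.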
Lay Summary: Many machine learning problems involve sampling from distributions that are defined by an energy function, where lower energy corresponds to higher probability. In molecular modeling, for example, stable molecular structures correspond to low-energy configurations, so sampling from these distributions helps predict likely structures. Training samplers for these problems is often expensive, since each update usually requires generating new samples and evaluating the energy function, which may be costly. We introduce Adjoint Sampling, a method that makes training much more efficient by reusing model samples across many updates. By looking backward through a related idealized process, the method extracts more learning signal from each sample, reducing the number of costly energy evaluations while still converging to the correct solution. This leads to high-quality samplers that are significantly cheaper to train. We show strong results on molecular structure generation across a wide range of molecule types, along with benchmarks to support future work on efficient, scalable sampling.
Link To Code: https://github.com/facebookresearch/adjoint_sampling
Primary Area: Probabilistic Methods->Monte Carlo and Sampling Methods
Keywords: Sampling, Stochastic Optimal Control
Submission Number: 14121