Optimization Benchmark for Diffusion Models on Dynamical Systems

Published: 31 Oct 2025, Last Modified: 28 Nov 2025
Venue: EurIPS 2025 Workshop PriGM
License: CC BY 4.0
Keywords: diffusion, training dynamics, optimization
TL;DR: We study training behaviour and performance of various optimizers and hyperparameter choices for a diffusion model related to complex dynamical systems.
Abstract: In this work, we benchmark recent optimization algorithms for training a diffusion model that denoises flow trajectories. We observe that Muon and SOAP are highly efficient alternatives to AdamW (18% lower final loss). We also revisit, in the context of diffusion model training, several phenomena recently reported for models trained on text or image data. These include the impact of the learning-rate schedule on the training dynamics and the performance gap between Adam and SGD.
Submission Number: 2