Convergence of Mean-Field Langevin Stochastic Descent-Ascent for Distributional Minimax Optimization

Published: 01 May 2025, Last Modified: 18 Jun 2025. ICML 2025 spotlight poster. License: CC BY 4.0.
TL;DR: We derive an $\tilde{O}(1/\epsilon)$ convergence rate for the discrete-time mean-field Langevin stochastic gradient descent-ascent algorithm for solving distributional minimax optimization.
Abstract:

We study convergence properties of the discrete-time Mean-Field Langevin Stochastic Descent-Ascent (MFL-SDA) algorithm for solving distributional minimax optimization. These problems arise in various applications, such as zero-sum games, generative adversarial networks, and distributionally robust learning. Despite the significance of MFL-SDA in these contexts, its discrete-time convergence rate remains underexplored. To address this gap, we establish a last-iterate convergence rate of $O(\frac{1}{\epsilon}\log\frac{1}{\epsilon})$ for MFL-SDA. This rate is nearly optimal when compared to the complexity lower bound of its Euclidean counterpart. It also matches the complexity of mean-field Langevin stochastic gradient descent for distributional minimization and the outer-loop iteration complexity of an existing double-loop algorithm for distributional minimax problems. By leveraging an elementary analysis framework that avoids PDE-based techniques, we overcome previous limitations and achieve a faster convergence rate.
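To make the setting concrete, the following is a minimal sketch of a particle discretization of a single-loop mean-field Langevin descent-ascent update, as commonly used for this class of problems. It is not the paper's implementation: the function name `mfl_sda`, the step size and regularization parameters, and the toy quadratic payoff are illustrative assumptions. Both players' particle clouds take simultaneous stochastic gradient steps (descent for the min player, ascent for the max player), with Gaussian noise whose scale reflects entropic regularization of the distributions.

```python
import numpy as np

def mfl_sda(grad_x, grad_y, n_particles=100, dim=1, eta=0.01,
            lam=0.1, n_iters=500, seed=0):
    """Sketch of a particle-based single-loop Langevin descent-ascent.

    grad_x, grad_y: gradients of the payoff w.r.t. the min and max
    players' particles. lam scales the Langevin noise (entropic
    regularization); eta is the step size.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_particles, dim))  # min player's particles
    Y = rng.standard_normal((n_particles, dim))  # max player's particles
    for _ in range(n_iters):
        gx, gy = grad_x(X, Y), grad_y(X, Y)
        # Simultaneous update: descent on X, ascent on Y, plus noise.
        X = X - eta * gx + np.sqrt(2 * lam * eta) * rng.standard_normal(X.shape)
        Y = Y + eta * gy + np.sqrt(2 * lam * eta) * rng.standard_normal(Y.shape)
    return X, Y

# Toy strongly-convex-strongly-concave payoff
# L(x, y) = x^2/2 + x*y - y^2/2, with unique saddle point (0, 0).
grad_x = lambda X, Y: X + Y   # dL/dx
grad_y = lambda X, Y: X - Y   # dL/dy
X, Y = mfl_sda(grad_x, grad_y)
```

For this toy payoff the particle means of both players contract toward the saddle point at the origin, with residual spread governed by the noise level `lam`.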

Lay Summary:

Finding mixed Nash equilibria in zero-sum games has long been a subject of significant research interest, primarily because optimization over spaces of distributions is inherently challenging. A prevalent tool for analyzing optimization convergence in such spaces is mean-field Langevin dynamics. Conventional methodologies typically first establish convergence in continuous time via gradient-flow techniques and then discretize in time. In contrast, our approach analyzes the discrete-time algorithm and its step size directly. This methodological advancement enables our convergence analysis to achieve near-optimality for the problem at hand. We also demonstrate that our framework supports convergence analysis for a wide range of prominent problems, including Generative Adversarial Networks (GANs) and zero-sum games, thereby illustrating its broad applicability.

Primary Area: Optimization
Keywords: mean-field Langevin dynamics, minimax optimization, Wasserstein gradient flow
Submission Number: 10539