A Proximal Stochastic Gradient Method for Doubly-regularized Spectral Risk Minimization

Submitted to ICLR 2026 on 18 Sept 2025 (modified: 11 Feb 2026). License: CC BY 4.0
Keywords: Spectral risk minimization, distribution shift regularization, non-differentiable regularization, proximal stochastic gradient
Abstract: Spectral risk minimization (SRM) is an important category of distributionally robust optimization. Recent works in this field address either distribution shift regularization (DSR) on the spectrum or non-differentiable regularization (NDR) on the parameters. However, few methods can handle both regularizers simultaneously. The main difficulty lies in suppressing the bias and variance of the stochastic gradient when double regularization is present. To solve this problem, we develop a novel proximal stochastic gradient method (PSG-SRM) that simultaneously handles double regularization, reduces bias and variance over iterations, and achieves linear convergence. It has lower computational complexity than two state-of-the-art methods that handle DSR or NDR separately. Experimental results indicate that it achieves competitive performance on both regression and classification tasks and remains stable under randomness.
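The abstract's core primitive, a proximal stochastic gradient step, can be sketched generically. This is not the paper's PSG-SRM algorithm (which additionally handles the spectral weights and bias/variance reduction); it is a minimal illustration of the prox-step idea for a composite objective f(w) + λ‖w‖₁, where the ℓ1 penalty stands in as an assumed example of a non-differentiable regularizer:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sgd_step(w, grad, lr, lam):
    # One proximal stochastic gradient step for min_w f(w) + lam * ||w||_1:
    # a gradient step on the smooth part f, then the prox of the nonsmooth part.
    return soft_threshold(w - lr * grad, lr * lam)

# Toy example (illustrative, not the paper's setup): sparse least squares,
# f(w) = 0.5 * ||Xw - y||^2, with single-sample stochastic gradients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true
w = np.zeros(10)
for _ in range(500):
    i = rng.integers(0, 50)              # sample one data point
    g = (X[i] @ w - y[i]) * X[i]         # stochastic gradient of f
    w = prox_sgd_step(w, g, lr=0.01, lam=0.05)
```

The prox step keeps iterates exactly sparse, which a plain subgradient step on the ℓ1 term would not; this is the usual motivation for proximal methods under non-differentiable regularization.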
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 12109