BCOS: A Method for Stochastic Approximation

10 May 2025 (modified: 29 Oct 2025) · Submitted to NeurIPS 2025 · CC BY 4.0
Keywords: stochastic approximation, deep-learning optimizer, block coordinate update, adaptive stepsize, almost-sure convergence
Abstract: We consider stochastic approximation with block-coordinate stepsizes and propose adaptive stepsize rules that aim to minimize the expected distance of the next iterate from an optimal point. These stepsize rules use online estimates of the second moment of the search direction along each block coordinate, and the popular Adam algorithm can be interpreted as using a particular heuristic for such estimation. By leveraging a simple conditional estimator, we derive variants of BCOS that achieve competitive performance while requiring fewer optimizer states and hyper-parameters. In addition, our convergence analysis relies on a simple aiming condition that assumes neither convexity nor smoothness, and thus has broad applicability.
Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)
Submission Number: 13962
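The abstract above describes stepsizes driven by an online second-moment estimate of the search direction, maintained per block coordinate. The snippet below is a minimal illustrative sketch of that general idea only: the EMA estimator, the constants, and the function name `block_scaled_step` are assumptions for illustration and are not the paper's actual BCOS update rules.

```python
# Illustrative sketch (not the authors' BCOS algorithm): scale each block's update
# by an online estimate of the second moment of its search direction, so the whole
# block shares a single adaptive stepsize.
import numpy as np

def block_scaled_step(param, grad, second_moment, lr=1e-3, beta=0.99, eps=1e-8):
    """One update for a single parameter block (e.g., one layer's weight matrix)."""
    # Exponential moving average of the block-wise second moment of the gradient;
    # the EMA form and beta value are assumptions, chosen for familiarity with Adam.
    second_moment = beta * second_moment + (1.0 - beta) * np.mean(grad ** 2)
    # One scalar stepsize per block: learning rate divided by the block's RMS
    # search-direction magnitude (only one extra state scalar per block).
    step = lr / (np.sqrt(second_moment) + eps)
    return param - step * grad, second_moment

# Toy usage on f(x) = 0.5 * ||x||^2 with noisy gradients, treating x as one block.
rng = np.random.default_rng(0)
x, v = rng.normal(size=100), 0.0
for _ in range(500):
    noisy_grad = x + 0.1 * rng.normal(size=x.shape)
    x, v = block_scaled_step(x, noisy_grad, v, lr=0.1)
print(f"final ||x|| = {np.linalg.norm(x):.4f}")
```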