Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 Spotlight · CC BY 4.0
Keywords: online learning, regret, acceleration
Abstract: Smoothness is known to be crucial for acceleration in offline optimization and for gradient-variation regret minimization in online learning. Interestingly, these two problems are closely connected: accelerated optimization can be understood through the lens of gradient-variation online learning. In this paper, we investigate online learning with *Hölder* smooth functions, a general class encompassing both smooth and non-smooth (Lipschitz) functions, and explore its implications for offline optimization. For (strongly) convex online functions, we design gradient-variation online learning algorithms whose regret smoothly interpolates between the optimal guarantees in the smooth and non-smooth regimes. Notably, our algorithms do not require prior knowledge of the Hölder smoothness parameter, exhibiting stronger adaptivity than existing methods. Through online-to-batch conversion, this gradient-variation online adaptivity yields an optimal universal method for stochastic convex optimization under Hölder smoothness. However, achieving universality in offline strongly convex optimization is more challenging. We address this by integrating online adaptivity with a detection-based guess-and-check procedure, which, for the first time, yields a universal offline method that achieves accelerated convergence in the smooth regime while maintaining near-optimal convergence in the non-smooth one.
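To fix notation for the two central notions in the abstract, a standard formulation from the literature (our paraphrase, not taken verbatim from the paper; constants and norms may differ) is the following. A differentiable function $f$ is $(L,\nu)$-Hölder smooth for some exponent $\nu \in [0,1]$ if

$$\|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|^{\nu} \quad \text{for all } x, y,$$

so that $\nu = 1$ recovers the usual $L$-smooth case and $\nu = 0$ essentially covers the non-smooth (Lipschitz) regime. The gradient-variation quantity typically used in this line of work over online functions $f_1, \dots, f_T$ on a domain $\mathcal{X}$ is

$$V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}} \|\nabla f_t(x) - \nabla f_{t-1}(x)\|^2,$$

with gradient-variation methods aiming for regret scaling with $\sqrt{1 + V_T}$ in the convex case rather than the worst-case $\sqrt{T}$.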
Supplementary Material: zip
Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)
Submission Number: 18953