Keywords: Generalized Smoothness, Stochastic Convex Optimization, First- and Zero-Order Algorithms
Abstract: This paper is devoted to the study of stochastic optimization problems under the generalized smoothness assumption. By considering the unbiased gradient oracle in _Stochastic Gradient Descent_, we provide strategies that yield convergence bounds containing summands with an exponential decrease of the objective. In particular, in the case $L_0 = 0$ and the **convex setup**, we obtain the iteration complexity $N = \mathcal{O} \left( L_1R \log\frac{1}{\varepsilon} + \frac{L_1 c R^2}{\varepsilon}\right)$ for _Clipped Stochastic Gradient Descent_ and $N = \mathcal{O} \left(L_1R \log\frac{1}{\varepsilon}\right)$ for _Normalized Stochastic Gradient Descent_. Furthermore, we generalize the convergence results to the case of a biased gradient oracle and show that the power of $(L_0,L_1)$-smoothness extends to _zero-order algorithms_. Finally, through numerical experiments on the logistic regression problem, which has attracted interest in the machine learning community, we demonstrate that the zero-order algorithm can outperform the first-order algorithm in the convex setup.
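As an informal illustration of the two first-order methods named in the abstract, the following is a minimal sketch (not the authors' code) of the standard clipped and normalized SGD update rules on a toy quadratic objective; the specific clipping rule, step sizes, noise model, and objective are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of the update rules discussed in the abstract, assuming
#   Clipped SGD:     x_{k+1} = x_k - eta * min(1, c / ||g_k||) * g_k
#   Normalized SGD:  x_{k+1} = x_k - eta * g_k / ||g_k||
# where g_k is a stochastic gradient from an unbiased oracle.
import numpy as np

def stochastic_grad(x, rng, noise=0.1):
    """Gradient of the toy objective f(x) = 0.5*||x||^2 plus Gaussian noise (unbiased oracle)."""
    return x + noise * rng.standard_normal(x.shape)

def clipped_sgd(x0, eta=0.1, c=1.0, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(iters):
        g = stochastic_grad(x, rng)
        scale = min(1.0, c / (np.linalg.norm(g) + 1e-12))  # clip gradient norm at c
        x -= eta * scale * g
    return x

def normalized_sgd(x0, eta=0.1, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(iters):
        g = stochastic_grad(x, rng)
        x -= eta * g / (np.linalg.norm(g) + 1e-12)  # unit-norm gradient step
    return x

if __name__ == "__main__":
    x0 = 5.0 * np.ones(10)
    print("Clipped SGD    ||x_N|| =", np.linalg.norm(clipped_sgd(x0)))
    print("Normalized SGD ||x_N|| =", np.linalg.norm(normalized_sgd(x0)))
```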
Supplementary Material: pdf
Primary Area: optimization
Submission Number: 8930