Keywords: Deep learning, optimization, smoothness
TL;DR: We empirically study the $(L_0,L_1)$-smoothness condition in the setting of feedforward networks with either $L2$ or cross-entropy loss; the experiments suggest that the $(L_0,L_1)$-smoothness condition does not hold in this setting.
Abstract: The $(L_0,L_1)$-smoothness condition was introduced by Zhang-He-Sra-Jadbabaie in 2020, who both proved convergence bounds under this assumption and provided empirical evidence that it is satisfied in deep learning. Since then, many groups have proven convergence guarantees for functions that satisfy this condition, motivated by the expectation that loss functions arising in deep learning satisfy it. In this paper, we provide a further empirical study of this condition in the setting of feedforward neural networks of depth at least 2, with $L2$ or cross-entropy loss. The results suggest that the $(L_0,L_1)$-smoothness condition is not satisfied in this setting.
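For reference, in the formulation of the 2020 paper (stated here for twice-differentiable $f$), $f$ is $(L_0,L_1)$-smooth if there are constants $L_0, L_1 \ge 0$ such that
$$\|\nabla^2 f(x)\| \;\le\; L_0 + L_1\,\|\nabla f(x)\| \quad \text{for all } x,$$
which reduces to ordinary $L$-smoothness in the case $L_1 = 0$.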
Is NeurIPS Submission: No
Submission Number: 85