A theoretical study of the $(L_0,L_1)$-smoothness condition in deep learning

Published: 10 Oct 2024, Last Modified: 07 Dec 2024 · NeurIPS 2024 Workshop · CC BY 4.0
Keywords: Deep learning, optimization, smoothness
TL;DR: In this paper we prove that for feedforward networks with two or more hidden layers, with either $L_2$ or cross-entropy loss, the $(L_0,L_1)$-smoothness condition is not satisfied.
Abstract: We study the $(L_0,L_1)$-smoothness condition introduced by Zhang, He, Sra, and Jadbabaie in 2020 in the setting of loss functions arising in deep learning. Theoretical work on $(L_0,L_1)$-smoothness has focused on convergence guarantees for functions that satisfy this condition. In this paper we provide a theoretical analysis of the condition in the setting of feedforward neural networks of depth at least 2, with either $L_2$ or cross-entropy loss, and find that the $(L_0,L_1)$-smoothness condition is not satisfied.
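For reference, the condition of Zhang, He, Sra, and Jadbabaie (2020) is commonly stated, for a twice-differentiable function $f$, as a pointwise bound on the Hessian norm by the gradient norm:

$$\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x,$$

which recovers ordinary $L$-smoothness when $L_1 = 0$.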
Submission Number: 121