Keywords: Sobolev training, Gradient flow, Convergence acceleration, ReLU networks
TL;DR: We show that Sobolev training provably accelerates the convergence of Rectified Linear Unit (ReLU) networks and quantify such 'Sobolev acceleration'.
Abstract: $\textit{Sobolev training}$, which integrates target derivatives into the loss functions, has been shown to accelerate convergence and improve generalization compared to conventional $L^2$ training. However, the underlying mechanisms of this training method remain incompletely understood. In this work, we show that Sobolev training provably accelerates the convergence of Rectified Linear Unit (ReLU) networks and quantify such `Sobolev acceleration' within the student--teacher framework. Our analysis builds on an analytical formula for the population gradients and Hessians of ReLU networks under centered spherical Gaussian input. Extensive numerical experiments validate our theoretical findings and show that the benefits of Sobolev training extend to modern deep learning tasks, including diffusion models.
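To make the loss described in the abstract concrete, here is a minimal NumPy sketch (hypothetical illustration, not the authors' code) of a Sobolev loss for a one-hidden-layer ReLU network: the usual $L^2$ term plus a first-derivative mismatch term, with all names (`net`, `sobolev_loss`, etc.) invented for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def net(x, W, a):
    # One-hidden-layer ReLU network on scalar inputs:
    # f(x) = sum_k a_k * relu(w_k * x)
    return relu(np.outer(x, W)) @ a

def net_grad(x, W, a):
    # Derivative of f w.r.t. the input x:
    # f'(x) = sum_k a_k * w_k * 1[w_k * x > 0]
    return ((np.outer(x, W) > 0.0) * W) @ a

def sobolev_loss(x, W, a, target, target_grad):
    # L^2 term (function values) plus H^1 seminorm term (first derivatives);
    # plain L^2 training would use only the first term.
    l2 = np.mean((net(x, W, a) - target(x)) ** 2)
    h1 = np.mean((net_grad(x, W, a) - target_grad(x)) ** 2)
    return l2 + h1
```

In the student--teacher setting analyzed in the paper, `target` and `target_grad` would come from a teacher network of the same form, so the loss vanishes exactly when the student matches the teacher.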
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 23675