Short optimization paths lead to good generalization

Fusheng Liu; Haizhao Yang; Qianxiao Li

Published: 28 Jan 2022, Last Modified: 13 Feb 2023

ICLR 2022 Submitted

Readers: Everyone

Keywords: optimization, generalization, machine learning theory

Abstract: Optimization and generalization are two essential aspects of machine learning. In this paper, we propose a framework that connects optimization with generalization by analyzing the generalization error in terms of the length of the optimization trajectory of gradient flow after convergence. Using this approach, we show that, with a proper initialization, gradient flow converges along a short path with an explicit length estimate. This estimate induces a length-based generalization bound, showing that a short optimization path after convergence indicates good generalization. Our framework applies to a broad range of settings. For example, we use it to obtain generalization estimates for three distinct machine learning models: underdetermined $\ell_p$ linear regression, kernel regression, and overparameterized two-layer ReLU neural networks.
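To make the notion of "optimization path length" concrete, the following is a minimal sketch and not the authors' method. It approximates gradient flow by gradient descent with a small step size on an underdetermined least-squares problem (the $p = 2$ case, chosen here for simplicity) and accumulates the length of the discrete trajectory. The function name `gd_path_length`, the squared loss, the zero initialization, the step size, and the problem dimensions are all assumptions made for illustration.

```python
import numpy as np

def gd_path_length(X, y, w0, lr=0.01, steps=5000):
    """Gradient descent on L(w) = 0.5 * mean((X w - y)^2).

    Small-step gradient descent is used as a discretization of gradient
    flow (an assumption of this sketch). Returns the final iterate and
    the accumulated trajectory length sum_k ||w_{k+1} - w_k||.
    """
    w = w0.copy()
    n = X.shape[0]
    length = 0.0
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n      # gradient of the mean squared loss
        step = lr * grad
        w = w - step
        length += np.linalg.norm(step)    # length of this segment of the path
    return w, length

# Hypothetical underdetermined problem: more parameters (d) than samples (n).
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
w0 = np.zeros(d)

initial_loss = 0.5 * np.mean((X @ w0 - y) ** 2)
w, length = gd_path_length(X, y, w0)
final_loss = 0.5 * np.mean((X @ w - y) ** 2)
displacement = np.linalg.norm(w - w0)     # straight-line distance travelled

print(f"loss: {initial_loss:.3e} -> {final_loss:.3e}")
print(f"path length: {length:.4f}, displacement: {displacement:.4f}")
```

By the triangle inequality, the path length is never smaller than the displacement between the initial and final parameters; a ratio close to one indicates a nearly straight, hence short, trajectory.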

One-sentence Summary: We propose a framework to connect optimization and generalization and use it to obtain generalization estimates on three machine learning models.

Supplementary Material: zip
