Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks

Published: 21 Sept 2023, Last Modified: 19 Jan 2024. NeurIPS 2023 poster.
Keywords: sample complexity; learning theory; neural networks; linear predictors
TL;DR: We provide new results on the sample complexity of vector-valued linear predictors, and more generally neural networks, showing that it can be surprisingly different from the well-studied setting of scalar-valued linear predictors.
Abstract: We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix), and more generally neural networks. Focusing on size-independent bounds, where only the Frobenius norm distance of the parameters from some fixed reference matrix $W_0$ is controlled, we show that the sample complexity behavior can be surprisingly different from what we might expect based on the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.
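For readers less familiar with the setting, the following is a minimal sketch (not the authors' code) of the hypothesis class the abstract refers to: vector-valued linear predictors $x \mapsto Wx$, where only the Frobenius-norm distance $\|W - W_0\|_F$ from a fixed reference matrix $W_0$ is bounded. The dimensions, the radius `B`, and the projection step are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

# Sketch (illustrative only): a vector-valued linear predictor x -> W x,
# where the only constraint is the Frobenius-norm distance ||W - W_0||_F
# from a fixed reference matrix W_0 (the "size-independent" setting).

rng = np.random.default_rng(0)

d, k = 50, 10          # input dimension d, output dimension k (hypothetical values)
B = 1.0                # bound on ||W - W_0||_F (hypothetical constraint radius)

W0 = np.zeros((k, d))  # fixed reference matrix (often taken to be zero)
W = W0 + rng.normal(size=(k, d))

# Project W onto the Frobenius-norm ball of radius B around W_0,
# so the predictor stays inside the constrained class.
delta = W - W0
dist = np.linalg.norm(delta, "fro")
if dist > B:
    W = W0 + delta * (B / dist)

x = rng.normal(size=d)
prediction = W @ x                       # vector-valued prediction in R^k
print(np.linalg.norm(W - W0, "fro"))     # at most B after projection
```

The scalar-valued case corresponds to k = 1 (a single output and a Euclidean-norm constraint); the paper's point is that the sample complexity of the matrix-valued class above can behave quite differently.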
Supplementary Material: pdf
Submission Number: 1892