Role of Over-Parameterization in Generalization of 3-layer ReLU Networks

Published: 19 Mar 2024, Last Modified: 10 May 2024, Tiny Papers @ ICLR 2024 (Present), CC BY 4.0
Keywords: Rademacher complexity, generalization, over-parameterization
TL;DR: We derive a Rademacher complexity bound for 3-layer ReLU networks.
Abstract: Over-parameterized neural networks defy conventional wisdom by generalizing effectively, yet standard complexity measures such as norms and margins fail to account for this. A recent work introduced a measure based on unit-wise capacities that yields a better explanation and tighter generalization bounds, but it was confined to two-layer networks. This paper extends that framework to three-layer ReLU networks. We empirically confirm the applicability of these measures and derive a corresponding theoretical Rademacher complexity bound.
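For context, a minimal sketch of the quantity the paper bounds (the standard textbook definition; the paper's specific three-layer, unit-wise-capacity bound is not reproduced here): for a hypothesis class $\mathcal{F}$ and a sample $S = (x_1, \ldots, x_n)$, the empirical Rademacher complexity is
$$
\hat{\mathfrak{R}}_S(\mathcal{F}) \;=\; \mathbb{E}_{\sigma}\!\left[\,\sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, f(x_i)\right],
$$
where the $\sigma_i$ are i.i.d. Rademacher random variables taking values in $\{-1, +1\}$ with equal probability.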
Submission Number: 169