Neural collapse versus low-rank bias: Is deep neural collapse really optimal?

Published: 16 Jun 2024, Last Modified: 16 Jun 2024, HiLD at ICML 2024 Poster, CC BY 4.0
Keywords: neural collapse, low-rank bias, deep neural collapse, unconstrained features model, deep unconstrained features model
TL;DR: We prove that deep neural collapse is not an optimal solution of the deep unconstrained features model in multi-class classification, due to a low-rank bias.
Abstract: Deep neural networks (DNNs) exhibit a surprising structure in their final layer known as neural collapse (NC), and a growing body of work has investigated its propagation to earlier layers -- a phenomenon called deep neural collapse (DNC). However, existing theoretical results are restricted to special cases: linear models, only two layers, or binary classification. In contrast, we focus on non-linear models of arbitrary depth in multi-class classification and reveal a surprising qualitative shift. As soon as we go beyond two layers or two classes, DNC is not optimal for the deep unconstrained features model (DUFM) -- the standard theoretical framework for the analysis of collapse. The main culprit is a low-rank bias of multi-layer regularization schemes, which leads to optimal solutions of even lower rank than those given by neural collapse. Our theoretical findings are supported by experiments on both DUFM and real data.
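For context, the DUFM objective referred to above is typically formulated along the following lines -- a sketch assuming MSE loss with ridge (Frobenius) regularization on both the free features and the weights, as in prior work on unconstrained features models; the notation $H_1$, $W_l$, $Y$, $\lambda$, $\sigma$ is introduced here only for illustration:

\[
\min_{H_1,\, W_2, \dots, W_L} \; \frac{1}{2N}\,\bigl\| W_L\,\sigma\!\bigl(W_{L-1} \cdots \sigma(W_2 H_1)\bigr) - Y \bigr\|_F^2 \;+\; \frac{\lambda}{2}\Bigl( \|H_1\|_F^2 + \sum_{l=2}^{L} \|W_l\|_F^2 \Bigr),
\]

where $H_1$ collects the freely optimized features of the $N$ training samples, $W_2, \dots, W_L$ are the weights of the remaining layers, $\sigma$ is the non-linearity, and $Y$ contains the one-hot class labels. Deep neural collapse corresponds to collapsed, maximally symmetric representations at every layer; the paper's finding is that, beyond two layers or two classes, the multi-layer regularization term favors optimal solutions of strictly lower rank than this structure.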
Student Paper: Yes
Submission Number: 25