Limitations of Neural Collapse for Understanding Generalization in Deep Learning

TMLR Paper763 Authors

10 Jan 2023 (modified: 17 Sept 2024) | Rejected by TMLR | CC BY 4.0

Abstract: The recent work of Papyan, Han, and Donoho (2020) presented an intriguing “Neural Collapse” phenomenon, showing a structural property of interpolating classifiers in the late stage of training. This opened a rich area of exploration studying this phenomenon. Our motivation is to study how far understanding Neural Collapse can take us in understanding deep learning. First, we investigate its role in generalization. We refine the Neural Collapse conjecture into two separate conjectures: collapse on the train set (an optimization property) and collapse on the test distribution (a generalization property). We find that while Neural Collapse often occurs on the train set, it does not occur on the test set. We thus conclude that Neural Collapse is primarily an optimization phenomenon, with as-yet-unclear connections to generalization. Second, we investigate the role of Neural Collapse in representation learning. We show simple, realistic experiments where more collapse leads to worse last-layer features, as measured by transfer performance on a downstream task. This suggests that, contrary to previous claims, Neural Collapse is not always desirable for representation learning. Our work thus clarifies the phenomenon of Neural Collapse, via more precise definitions that motivate controlled experiments.

Submission Length: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=FFqMVz0oQm

Changes Since Last Submission: We believe there was some misunderstanding by the reviewers: the claims made in Section 4.1 were theoretical claims, not experimental ones, which is why we didn't have corresponding experiments. We've clarified this by explicitly stating the (simple) theoretical claim as a "Lemma" and providing a proof. We believe that with this change, we've addressed all of the relevant reviewer feedback. In particular, we believe all claims are technically correct and substantiated with either proofs or experiments.

Assigned Action Editor: ~Alexander_A_Alemi1

Submission Number: 763
