Limitations of Neural Collapse for Understanding Generalization in Deep Learning

TMLR Paper 423 Authors

11 Sept 2022 (modified: 28 Feb 2023) · Rejected by TMLR
Abstract: The recent work of Papyan, Han, and Donoho (2020) presented an intriguing “Neural Collapse” phenomenon, showing a structural property of interpolating classifiers in the late stage of training. This opened a rich area of exploration studying this phenomenon. Our motivation is to study how far understanding Neural Collapse will take us in understanding deep learning. First, we investigate its role in generalization. We refine the Neural Collapse conjecture into two separate conjectures: collapse on the train set (an optimization property) and collapse on the test distribution (a generalization property). We find that while Neural Collapse often occurs on the train set, it does not occur on the test set. We thus conclude that Neural Collapse is primarily an optimization phenomenon, with as-yet-unclear connections to generalization. Second, we investigate the role of Neural Collapse in representation learning. We show simple, realistic experiments where more collapse leads to worse last-layer features, as measured by transfer performance on a downstream task. This suggests that Neural Collapse is not always desirable for representation learning, as previously claimed. Our work thus clarifies the phenomenon of Neural Collapse via more precise definitions that motivate controlled experiments.
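The abstract's refinement into two conjectures is directly measurable: one can compute a within-class variability collapse statistic separately on train-set and test-set features. Below is a minimal sketch of such a measurement, assuming penultimate-layer features and labels are available as NumPy arrays; the ratio Tr(Σ_W)/Tr(Σ_B) used here is a simplified stand-in for the NC1 statistic of Papyan et al. (2020), and the array names and random data are hypothetical placeholders, not the paper's actual experimental setup.

```python
import numpy as np

def nc1_collapse_metric(features, labels):
    """Within-class variability collapse (NC1-style) metric.

    Returns Tr(Sigma_W) / Tr(Sigma_B), the ratio of within-class to
    between-class feature variance. Values near 0 indicate strong
    collapse: each class's features have converged to its class mean.
    This is a simplified variant of the NC1 statistic of Papyan et al.
    """
    classes = np.unique(labels)
    global_mean = features.mean(axis=0)
    within, between = 0.0, 0.0
    for c in classes:
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Within-class scatter: squared deviations from the class mean.
        within += ((class_feats - class_mean) ** 2).sum()
        # Between-class scatter: class-mean deviation from the global mean.
        between += len(class_feats) * ((class_mean - global_mean) ** 2).sum()
    return within / between

# To test the two refined conjectures separately, evaluate the metric on
# train features (optimization property) and on held-out test features
# (generalization property). Random data here is a placeholder for
# penultimate-layer activations of a trained network.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(1000, 64))
train_labels = rng.integers(0, 10, size=1000)
print("train NC1:", nc1_collapse_metric(train_feats, train_labels))
# Repeat with (test_feats, test_labels) to check collapse on the test set.
```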
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Alexander_A_Alemi1
Submission Number: 423