Neural Collapse: A Review on Modelling Principles and Generalization
Abstract: Deep classifier neural networks enter the terminal phase of training (TPT) when training error reaches zero and tend to exhibit intriguing Neural Collapse (NC) properties. Neural collapse essentially represents a state at which the within-class variability of final hidden layer outputs is infinitesimally small and their class means form a simplex equiangular tight frame. This simplifies the last layer behaviour to that of a nearest-class center decision rule. Despite the simplicity of this state, the dynamics and implications of reaching it are yet to be fully understood. In this work, we review the principles which aid in modelling neural collapse, followed by the implications of this state on generalization and transfer learning capabilities of neural networks. Finally, we conclude by discussing potential avenues and directions for future research.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Camera-ready submission with changes incorporated from reviewer discussions.
Assigned Action Editor: ~Jeffrey_Pennington1
Submission Number: 758