Linking Neural Collapse and L2 Normalization with Improved Out-of-Distribution Detection in Deep Neural Networks

28 Sept 2022, 22:10 (modified: 27 Jan 2023, 19:42) Accepted by TMLR
Abstract: We propose a simple modification to standard ResNet architectures, L2 normalization over feature space, that substantially improves out-of-distribution (OoD) performance on the previously proposed Deep Deterministic Uncertainty (DDU) benchmark. We show that this change also induces early Neural Collapse (NC), an effect linked to better OoD performance. Our method achieves comparable or superior OoD detection scores and classification accuracy in a small fraction of the training time of the benchmark. Additionally, it substantially improves worst-case OoD performance over multiple, randomly initialized models. Though we do not suggest that NC is the sole mechanism or a comprehensive explanation for OoD behaviour in deep neural networks (DNNs), we believe NC's simple mathematical and geometric structure can provide a framework for analysis of this complex phenomenon in future work.
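As a rough illustration of what L2 normalization over feature space means, the sketch below rescales each feature vector in a batch to unit Euclidean length. It is a minimal pure-Python example, not the authors' implementation: the function name `l2_normalize`, the `eps` safeguard, and the toy batch are all assumptions, and the abstract does not state at which layer the normalization is applied.

```python
import math


def l2_normalize(features, eps=1e-12):
    """Scale each feature vector to unit Euclidean (L2) length.

    `features` is a list of equal-length lists of floats, standing in for a
    batch of feature vectors taken from a network (hypothetical input).
    `eps` is an assumed safeguard against division by zero.
    """
    normalized = []
    for vec in features:
        norm = math.sqrt(sum(x * x for x in vec))
        scale = 1.0 / max(norm, eps)
        normalized.append([x * scale for x in vec])
    return normalized


# Toy batch of three 2-dimensional feature vectors (illustrative values only).
batch = [[3.0, 4.0], [0.0, 2.0], [1.0, 1.0]]
unit_batch = l2_normalize(batch)

# After normalization every vector lies on the unit hypersphere.
norms = [math.sqrt(sum(x * x for x in vec)) for vec in unit_batch]
```

After this step only the direction of each feature vector carries information, since every vector has the same length.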
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
- changed title so it's clear we're not claiming causality of NC->OoD
- softened claim 4 and references to it in the paper so we're not implying causality of NC->OoD
- added experiments overtraining L2 norm models to 350 epochs, for direct comparison to no-L2 baselines
- added experiments supporting motivations for fitting GMMs to feature space
- included experiments with softmax scores as an AUROC scoring rule to see the effect of L2 norm on softmax when used for this purpose
- fixed typos
- added LaTeX reference links to tables, figures and sections
Assigned Action Editor: ~Neil_Houlsby1
Submission Number: 465