Are all layers created equal: A neural collapse perspective

Published: 11 Feb 2025, Last Modified: 06 Mar 2025, CPAL 2025 (Proceedings Track) Poster, CC BY 4.0
Keywords: Deep Learning, Neural Collapse, Robustness, Generalization, Memorization, Understanding
TL;DR: An extensive investigation of intermediate representations from the perspective of neural collapse, offering insights into the roles of layers in generalization, memorization, and robustness.
Abstract: Understanding how features evolve layer by layer is crucial for uncovering the inner workings of deep neural networks. \textit{Progressive neural collapse}, where successive layers increasingly compress within-class features and enhance class separation, has been studied primarily empirically in small architectures on simple tasks, or theoretically in linear networks. However, its behavior in larger architectures and on complex datasets remains underexplored. In this work, we extend the study of progressive neural collapse to larger models and more complex datasets, including both clean and noisy data settings, offering a comprehensive understanding of its role in generalization and robustness. Our findings reveal three key insights: 1. Layer inequality: Deeper layers significantly enhance neural collapse and play a vital role in generalization, but they are also more susceptible to memorization. 2. Depth-dependent behavior: In deeper models, middle layers contribute minimally because they enhance neural collapse less, leading to redundancy and limited generalization gains; this validates the effectiveness of layer pruning. 3. Architectural differences: Transformer models outperform convolutional models in enhancing neural collapse on larger datasets and are more robust to memorization: deeper Transformers memorize less, whereas deeper convolutional models show the opposite trend. These findings provide new insights into the hierarchical roles of layers and their interplay with architectural design, shedding light on how deep neural networks process data and generalize under challenging conditions.
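
The abstract measures how strongly each layer "enhances neural collapse", i.e., how much within-class feature variability shrinks relative to between-class separation as depth increases. The paper's exact metric is not specified here, so below is a minimal sketch of the standard NC1-style measure commonly used in the neural collapse literature (tr(Σ_W Σ_B^†)/C, which tends toward 0 as features collapse onto class means); the function name `nc1_metric` and this particular formulation are assumptions for illustration, not the authors' confirmed implementation.

```python
import numpy as np

def nc1_metric(features: np.ndarray, labels: np.ndarray) -> float:
    """Within-class variability collapse (NC1) for one layer's features.

    features: (N, D) array of a layer's representations for N samples.
    labels:   (N,) integer class labels.
    Returns tr(Sigma_W @ pinv(Sigma_B)) / C, which shrinks toward 0 as
    within-class features collapse onto their class means.
    NOTE: this is an assumed, standard NC1 formulation, not necessarily
    the exact metric used in the paper.
    """
    classes = np.unique(labels)
    num_classes = len(classes)
    n_samples, dim = features.shape
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((dim, dim))  # within-class covariance (averaged over samples)
    sigma_b = np.zeros((dim, dim))  # between-class covariance of class means

    for c in classes:
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        centered = class_feats - class_mean
        sigma_w += centered.T @ centered / n_samples
        diff = (class_mean - global_mean)[:, None]
        sigma_b += diff @ diff.T / num_classes

    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / num_classes)
```

In a per-layer study of the kind described above, one would extract features at each layer (e.g., via forward hooks), compute this metric layer by layer, and inspect how quickly it decays with depth; a sharper decay in deeper layers corresponds to the "layer inequality" finding, while a flat region in the middle of deep models corresponds to the redundancy that motivates layer pruning.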
Submission Number: 83