Fixed Non-negative Orthogonal Classifier: Inducing Zero-mean Neural Collapse with Feature Dimension Separation

Published: 16 Jan 2024 · Last Modified: 19 Apr 2024 · ICLR 2024 poster
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: neural collapse, orthogonal, fixed classifier, imbalanced, continual
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Fixed classifiers in neural networks for classification problems have demonstrated cost efficiency and, when incorporating orthogonality, have even outperformed learnable classifiers on some popular benchmarks. Despite these advantages, prior research has yet to investigate the training dynamics of fixed orthogonal classifiers with respect to neural collapse, a recently clarified phenomenon in which last-layer features converge to a specific form, called a simplex equiangular tight frame (ETF), when classification models are trained into the post-zero-error phase. Ensuring this phenomenon is critical for obtaining global optimality in a layer-peeled model and can therefore lead to enhanced performance in practice. However, fixed orthogonal classifiers cannot invoke neural collapse due to their geometric limitations. To overcome these limits, we analyze $\textit{zero-mean neural collapse}$, which accounts for orthogonality in the non-negative Euclidean space. We then propose a $\textit{fixed non-negative orthogonal classifier}$ that induces the optimal solution and maximizes the margin of an orthogonal layer-peeled model by satisfying the properties of zero-mean neural collapse. Building on this foundation, we exploit the $\textit{feature dimension separation}$ effect inherent in our classifier for two further purposes: (1) enhancing softmax masking by mitigating feature interference in continual learning, and (2) tackling the limitations of mixup on the hypersphere in imbalanced learning. We conducted comprehensive experiments on various datasets and architectures and demonstrated significant performance improvements.
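The paper's exact construction is not reproduced on this page, but as a rough illustration, below is a minimal PyTorch sketch of one way a fixed non-negative orthogonal classifier could be built. Since non-negative vectors can only be mutually orthogonal when their supports are disjoint, each class is assigned its own block of feature dimensions, which is exactly the kind of feature dimension separation the abstract describes. The module and helper names (`FixedNonNegOrthogonalClassifier`, `make_nonneg_orthogonal_weights`) and the uniform block weighting are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn


def make_nonneg_orthogonal_weights(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Build a fixed non-negative orthogonal weight matrix (illustrative sketch).

    Non-negative vectors are mutually orthogonal only if their supports are
    disjoint, so each class row is supported on its own block of feature
    dimensions -- this disjointness is what separates feature dimensions
    across classes.
    """
    assert feat_dim >= num_classes, "need at least one dimension per class"
    block = feat_dim // num_classes
    W = torch.zeros(num_classes, feat_dim)
    for c in range(num_classes):
        # unit-norm, non-negative row supported on a disjoint block of dims
        W[c, c * block:(c + 1) * block] = 1.0 / block ** 0.5
    return W


class FixedNonNegOrthogonalClassifier(nn.Module):
    """Last layer with frozen, non-negative, mutually orthogonal class vectors."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        W = make_nonneg_orthogonal_weights(num_classes, feat_dim)
        # registered as a buffer so it is fixed: never updated by the optimizer
        self.register_buffer("weight", W)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # logits are inner products with the fixed class vectors
        return features @ self.weight.t()


# sanity check: rows are orthonormal and non-negative
W = make_nonneg_orthogonal_weights(num_classes=10, feat_dim=512)
assert torch.allclose(W @ W.t(), torch.eye(10))
assert (W >= 0).all()
```

One plausible consequence of this disjoint-support layout, consistent with the applications the abstract lists: masking a class's logit corresponds to masking a disjoint set of feature dimensions, which would mitigate the feature interference that softmax masking suffers from in continual learning.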
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 4777