Keywords: Label Smoothing, Regularization, Representation Learning, Explainability
Abstract: Label Smoothing aims to prevent Neural Networks from making over-confident predictions and to improve generalization.
Owing to its effectiveness, it has become an indispensable ingredient of training recipes for tasks such as Image Recognition and Neural Machine Translation. Nevertheless, previous work shows that it encourages an overly tight cluster in the feature space, which "erases" the similarity information of individual examples and thereby impairs representation learning. By decomposing the loss induced by Label Smoothing into a combination of a regularization term and an error-enhancement term, we reveal a previously unknown defect: it in fact encourages classifiers to be over-confident when they make incorrect predictions. To remedy this, we present a solution called Max Suppression (MaxSup), which consistently applies the intended regularization effect during training, independent of the correctness of the prediction. By visualizing the learned features, we show that MaxSup successfully enlarges intra-class variation while improving inter-class separability. We further conduct experiments on Image Classification and Machine Translation tasks, validating the superiority of Max Suppression. The code implementation is available at [anonymous repository](https://anonymous.4open.science/r/Maximum-Suppression-Regularization-DB0C).
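As a concrete illustration of the idea sketched in the abstract, below is a minimal PyTorch-style sketch of what a MaxSup objective could look like: where Label Smoothing effectively adds a penalty on the ground-truth logit relative to the mean logit (which degenerates into error enhancement when the prediction is wrong), MaxSup penalizes the maximum logit instead, so the suppression applies regardless of correctness. The function name `maxsup_loss`, the weight `alpha`, and the exact form of the penalty are our assumptions for illustration and are not taken from the paper's implementation.

```python
import torch
import torch.nn.functional as F

def maxsup_loss(logits: torch.Tensor, targets: torch.Tensor,
                alpha: float = 0.1) -> torch.Tensor:
    """Cross-entropy plus a Max Suppression penalty (illustrative sketch).

    Label Smoothing with weight alpha adds roughly
        alpha * (z_gt - mean(z))
    to the cross-entropy, which only regularizes when the ground-truth
    logit z_gt is also the top logit. This sketch instead penalizes
        alpha * (z_max - mean(z)),
    suppressing the top logit whether or not the prediction is correct.
    """
    ce = F.cross_entropy(logits, targets)
    z_max = logits.max(dim=-1).values    # top logit per example
    z_mean = logits.mean(dim=-1)         # mean logit per example
    penalty = (z_max - z_mean).mean()    # batch-averaged suppression term
    return ce + alpha * penalty
```

In this hypothetical form, MaxSup is a drop-in replacement for a label-smoothed cross-entropy: when the classifier is correct, `z_max` coincides with the ground-truth logit and the penalty matches the regularization term of Label Smoothing; when it is wrong, the penalty still shrinks the (incorrect) top logit instead of amplifying the error.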
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3323