Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels and the Multi-class Unhinged Loss Function

22 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Noisy labels, Symmetric loss functions, Multi-class loss decomposition, Unhinged loss function
TL;DR: We investigate a principled symmetrization method for loss functions, with emphasis on a multi-class extension of the unhinged loss function.
Abstract: Labeling a training set is not only often expensive but also susceptible to errors. Consequently, the development of loss functions that are robust to label noise has emerged as a problem of great importance. The symmetry condition provides theoretical guarantees of robustness to such noise. In this work, we investigate a symmetrization method that follows from the unique decomposition of any multi-class loss function into the sum of a symmetric loss function and a class-insensitive term. We describe how this approach relates to regularization arising from Dirichlet priors on the outputs of the network. Notably, the special case of symmetrizing the cross-entropy loss yields a multi-class extension of the unhinged loss function. This loss function is linear, but, in contrast to the binary case, its coefficients must take specific values to satisfy the symmetry condition. Under appropriate assumptions, we show that this multi-class unhinged loss is the unique convex multi-class symmetric loss function. It plays an interesting role among multi-class symmetric loss functions, since the linear approximation of any symmetric loss function around points with equal components must be equivalent to the multi-class unhinged loss. Remarkably, even though the cross-entropy loss is not inherently robust, it also exhibits this property: around points that assign equal probability to every class, the cross-entropy loss is approximately symmetric. Experiments on CIFAR validate the robustness of our approach.
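A minimal PyTorch sketch of the symmetrization idea described in the abstract. It assumes the decomposition subtracts the class-averaged (class-insensitive) term from the base loss, so that the sum of the resulting loss over all K labels is constant; the function names `symmetrize` and `multiclass_unhinged`, and the exact constants in the linear loss, are illustrative rather than the paper's definitions.

```python
import torch
import torch.nn.functional as F

def symmetrize(base_loss, logits, targets):
    """Symmetric part of `base_loss`: subtract its average over all
    candidate labels, so that sum_y L_sym(logits, y) = 0 for any logits."""
    num_classes = logits.size(1)
    per_label = torch.stack(
        [base_loss(logits, torch.full_like(targets, c)) for c in range(num_classes)]
    )  # shape (K, N): loss of each example under every candidate label
    return base_loss(logits, targets) - per_label.mean(dim=0)

def multiclass_unhinged(logits, targets):
    """Linear loss 1 - z_y + mean_j(z_j). The 1/K weight on the sum of
    logits is exactly what makes sum_y L(z, y) constant (symmetry)."""
    z_y = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
    return 1.0 - z_y + logits.mean(dim=1)

# Symmetrizing cross-entropy: the log-sum-exp terms cancel, leaving
# mean_j(z_j) - z_y, which matches the unhinged loss up to a constant.
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
ce = lambda z, t: F.cross_entropy(z, t, reduction="none")
assert torch.allclose(
    symmetrize(ce, logits, targets),
    multiclass_unhinged(logits, targets) - 1.0,
    atol=1e-5,
)
```

In this form the symmetry condition can be checked numerically: summing either loss over all K candidate labels yields a constant that does not depend on the logits.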
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4378