A Note On The Stability Of The Focal Loss

TMLR Paper5166 Authors

20 Jun 2025 (modified: 08 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: The Focal loss is a widely deployed loss function used to train various types of deep learning models. It is a modification of the cross-entropy loss designed to mitigate the effect of class imbalance in dense object detection tasks by down-weighting easy, well-classified examples. In doing so, more focus is placed on hard, wrongly-classified examples, preventing the gradients from being dominated by examples for which the model can easily predict the correct class. This down-weighting is achieved by scaling the cross-entropy loss with a term that depends on a focusing parameter $\gamma$. In this paper, we highlight an unaddressed instability of the Focal loss that arises when this focusing parameter is set to a value between 0 and 1. We present the theoretical foundation behind this instability, show that it is numerically identifiable, and demonstrate it on binary classification and segmentation tasks on the MNIST dataset. Additionally, we propose a straightforward modification to the original Focal loss that ensures stability whenever these unstable focusing parameter values are used.
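The abstract's instability claim can be illustrated numerically. The sketch below (not the paper's code; the function name and the choice of NumPy are my own) inspects the chain-rule factors that reverse-mode autodiff multiplies together when differentiating the Focal loss $FL(p) = -(1-p)^{\gamma}\log(p)$ with respect to the predicted probability $p$. For $0 < \gamma < 1$, the derivative of the modulating factor, $-\gamma(1-p)^{\gamma-1}$, overflows to infinity as $p \to 1$, while the cross-entropy factor $-\log(p)$ vanishes; their product is then `inf * 0 = NaN` in floating point, even though the analytic limit is finite:

```python
import numpy as np

def focal_grad_factors(p, gamma):
    """Chain-rule factors for d/dp of FL(p) = -(1 - p)**gamma * log(p).

    Returns the modulating-factor derivative, the cross-entropy factor,
    and the full product-rule gradient, as an autodiff engine would
    compute them term by term.
    """
    p = np.asarray(p, dtype=np.float64)
    # d/dp (1 - p)**gamma = -gamma * (1 - p)**(gamma - 1):
    # for gamma in (0, 1) this overflows to -inf as p -> 1.
    d_modulator = -gamma * (1.0 - p) ** (gamma - 1.0)
    ce = -np.log(p)  # cross-entropy factor, which vanishes as p -> 1
    # Product rule: d/dp [ (1-p)**gamma * (-log p) ]
    grad = d_modulator * ce - (1.0 - p) ** gamma / p
    return d_modulator, ce, grad

with np.errstate(divide="ignore", invalid="ignore"):
    # A perfectly classified example (p == 1) with gamma = 0.5:
    # inf * 0 yields NaN, which then poisons the whole gradient update.
    _, _, g_unstable = focal_grad_factors(1.0, 0.5)
    # The commonly used gamma = 2 stays finite at p == 1.
    _, _, g_stable = focal_grad_factors(1.0, 2.0)

print(g_unstable)  # nan
print(g_stable)    # 0.0
```

This is consistent with the abstract's claim that the instability is numerically identifiable for $\gamma \in (0, 1)$; for $\gamma \geq 1$ the exponent $\gamma - 1$ is non-negative, so the modulating-factor derivative never diverges.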
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Venkatesh_Babu_Radhakrishnan2
Submission Number: 5166