A Note On The Stability Of The Focal Loss

Published: 03 Nov 2025, Last Modified: 03 Nov 2025. Accepted by TMLR. License: CC BY 4.0
Abstract: The Focal Loss is a widely deployed loss function used to train many types of deep learning models. It is a modification of the cross-entropy loss designed to mitigate the effect of class imbalance in dense object detection tasks. By downweighting the losses of easy, correctly classified samples, it places more emphasis on harder, misclassified ones, so that gradient updates are not dominated by samples the model already handles correctly. The downweighting is achieved by scaling the cross-entropy loss with a term that depends on a focusing parameter $\gamma$. In this paper, we highlight an unaddressed numerical instability of the Focal Loss that arises when this focusing parameter is set to a value strictly between 0 and 1. We present the theoretical basis of this instability, show that it can be detected in the computation of Focal Loss gradients, and demonstrate its effects across several classification and segmentation tasks. Finally, we propose a straightforward modification of the original Focal Loss that ensures stability whenever these unstable focusing parameter values are used.
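To make the failure mode concrete, here is a minimal, self-contained sketch (written in PyTorch; it is not taken from the paper or its repository). For $0 < \gamma < 1$, the modulating factor $(1 - p_t)^\gamma$ has the derivative $-\gamma (1 - p_t)^{\gamma - 1}$, which diverges as $p_t \to 1$. Once $p_t$ rounds to exactly 1 in floating point, autograd evaluates $0^{\gamma - 1} = \infty$, and multiplying by $\log p_t = 0$ yields NaN, even though the analytic gradient is finite:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma):
    # Focal Loss of Lin et al.: FL(p_t) = -(1 - p_t)^gamma * log(p_t)
    log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return -((1.0 - pt) ** gamma) * log_pt

# A confidently and correctly classified sample: in float32 the
# log-softmax output rounds to exactly 0.0, so p_t becomes exactly 1.0.
logits = torch.tensor([[20.0, -20.0]], requires_grad=True)
targets = torch.tensor([0])

loss = focal_loss(logits, targets, gamma=0.5).sum()
loss.backward()

print(loss.item())   # 0.0 -- the forward pass looks harmless
print(logits.grad)   # tensor([[nan, nan]]) -- the backward pass produces NaNs
```

With $\gamma \geq 1$ the same code yields finite gradients, which is consistent with the instability being specific to focusing parameters strictly between 0 and 1.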
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: 1/ We have added a paragraph to the introduction to highlight the practical importance of solving the numerical instability. 2/ We have emphasised the importance of our solution in the final part of the introduction. 3/ We have made some minor textual modifications to improve readability. 4/ We have shortened the code in the Appendix and moved it to a GitHub repository instead.
Code: https://github.com/MartijnPeterVanLeeuwen/Focal-Loss-Instability/tree/main
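The paper proposes its own modification (see the linked repository for the authors' fix). Purely as a generic illustration of how the backward pass can be kept finite, the sketch below clamps the base of the modulating factor away from zero; the clamping approach and the `eps` value are assumptions of this sketch, not necessarily the modification proposed in the paper:

```python
import torch
import torch.nn.functional as F

def focal_loss_clamped(logits, targets, gamma, eps=1e-6):
    # Hypothetical stabilisation (not necessarily the paper's fix):
    # clamping 1 - p_t to [eps, 1] keeps the gradient of
    # (1 - p_t)^gamma finite for 0 < gamma < 1, since torch.clamp
    # passes zero gradient through the saturated region.
    log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    modulator = torch.clamp(1.0 - pt, min=eps) ** gamma
    return -modulator * log_pt
```

Rerunning the earlier example with this variant returns finite gradients for $\gamma = 0.5$, at the cost of a small bias of at most $\texttt{eps}^\gamma$ in the modulating factor near $p_t = 1$.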
Assigned Action Editor: ~Venkatesh_Babu_Radhakrishnan2
Submission Number: 5166