Readers: Everyone (since 04 Oct 2024). License: CC BY 4.0
Conventional machine learning techniques such as neural networks often struggle with imbalanced data. Such imbalance is common in real-world datasets, where collection methods may fail to capture enough samples within specific ranges of the target variable. Some tasks are also inherently imbalanced, with normal events far outnumbering edge cases. While imbalanced data has been studied extensively for classification, only a limited number of methods have been proposed for regression tasks. Moreover, existing methods often perform poorly on high-dimensional data, and imbalanced high-dimensional regression remains relatively unexplored. To address this gap, this paper presents SwitchLoss, a novel optimization scheme for neural networks, and SwitchLossR, a variant with a restricted search space. Unlike conventional approaches, SwitchLoss and SwitchLossR integrate variable loss functions into the traditional training process. We evaluate these methods on 15 regression datasets from diverse imbalanced domains, 5 synthetic high-dimensional imbalanced datasets, and 2 imbalanced age estimation image datasets. Our results show that using SwitchLoss and SwitchLossR in combination not only yields a notable reduction in validation error, but also outperforms current state-of-the-art techniques for imbalanced regression.