An effective two-stage training scheme for boundary decision of imbalanced samples

Published: 01 Jan 2025 · Last Modified: 15 May 2025 · Int. J. Mach. Learn. Cybern. 2025 · CC BY-SA 4.0
Abstract: How to classify imbalanced data is an active research direction in data mining and machine learning. To dynamically reduce the negative influence of imbalanced samples on the loss during training, existing cost-sensitive re-weighting methods assign different weights to imbalanced samples. However, these methods cannot effectively guide deep neural networks (DNNs) to reasonably partition the decision boundaries of the samples, so DNNs tend to overfit hard samples. In this study, we propose a new self-balanced loss function, called SBLoss, which adaptively assigns different weights to samples according to their influence on the decision boundary, thereby reducing the overfitting caused by hard samples. Extensive experiments on multiple real imbalanced datasets show that the proposed imbalanced-data classification method, based on a two-stage training scheme, achieves high accuracy and robustness and outperforms state-of-the-art methods.
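The abstract does not give SBLoss's formula, so the following is only background: a minimal sketch of the generic cost-sensitive re-weighting baseline the abstract contrasts against, where each sample's loss is scaled by a per-class weight derived from class frequencies. All function names and the "effective number of samples" weighting choice here are illustrative assumptions, not the authors' method.

```python
import numpy as np

def class_balanced_weights(labels, beta=0.999):
    """Per-class weights from the 'effective number of samples' heuristic:
    a class with n_c examples gets weight proportional to
    (1 - beta) / (1 - beta**n_c), so rarer classes weigh more.
    (Illustrative choice; SBLoss itself is not specified in the abstract.)"""
    classes, counts = np.unique(labels, return_counts=True)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)
    w = 1.0 / effective_num
    w = w / w.sum() * len(classes)  # normalize so weights average to 1
    return dict(zip(classes, w))

def weighted_cross_entropy(probs, labels, class_weights):
    """Cost-sensitive cross-entropy: each sample's log-loss is scaled by
    its class weight, so minority-class mistakes cost more."""
    eps = 1e-12
    per_sample = -np.log(probs[np.arange(len(labels)), labels] + eps)
    w = np.array([class_weights[y] for y in labels])
    return float(np.mean(w * per_sample))
```

Such static weights are fixed before training, which is exactly the limitation the paper targets: they cannot adapt to how close each sample sits to the evolving decision boundary.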