Abstract: Existing semi-supervised learning (SSL) algorithms typically assume class-balanced datasets, although the class distributions of many real-world datasets
are imbalanced. In general, classifiers trained on a class-imbalanced dataset are
biased toward the majority classes. This issue becomes more problematic for SSL
algorithms because they utilize the biased prediction of unlabeled data for training.
However, traditional class-imbalanced learning techniques, which are designed for
labeled data, cannot be readily combined with SSL algorithms. We propose a scalable class-imbalanced SSL algorithm that can effectively use unlabeled data, while
mitigating class imbalance by introducing an auxiliary balanced classifier (ABC)
of a single layer, which is attached to a representation layer of an existing SSL algorithm. The ABC is trained with a class-balanced loss of a minibatch, while using
high-quality representations learned from all data points in the minibatch using the
backbone SSL algorithm to avoid overfitting and information loss. Moreover, we
use a modified form of consistency regularization, a recent SSL technique for utilizing unlabeled data,
to train the ABC to be balanced among the classes by selecting
unlabeled data with the same probability for each class. The proposed algorithm
achieves state-of-the-art performance in various class-imbalanced SSL experiments
using four benchmark datasets.
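
The abstract does not spell out how the class-balanced minibatch loss is computed. The following is a minimal sketch of one plausible reading, assuming that each example is kept with a probability inversely proportional to the size of its class, so that every class contributes equally in expectation. The function names, the Bernoulli masking rule, and the use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def balanced_keep_prob(class_counts):
    """Per-class keep probability (assumed rule): N_min / N_c.

    The rarest class gets probability 1.0; larger classes are downweighted.
    """
    counts = np.asarray(class_counts, dtype=float)
    return counts.min() / counts


def balanced_minibatch_loss(logits, labels, class_counts, rng):
    """Cross-entropy over a minibatch with a random class-balancing mask.

    logits: array of shape (batch, num_classes)
    labels: integer class indices of shape (batch,)
    class_counts: number of training examples per class
    rng: a numpy.random.Generator used to draw the mask
    Returns the masked mean loss and the boolean mask that was drawn.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)

    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -log_probs[np.arange(len(labels)), labels]

    # Keep each example with the probability assigned to its class.
    keep_prob = balanced_keep_prob(class_counts)[labels]
    mask = rng.random(len(labels)) < keep_prob

    # Masked-out examples contribute zero; the mean is over the whole batch.
    return (per_example * mask).mean(), mask
```

Under the same assumption, the mask could be applied to unlabeled examples by replacing `labels` with the classes predicted for them, which is one way to read "selecting unlabeled data with the same probability for each class". All data points would still pass through the backbone, so only the auxiliary classifier's loss is masked.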