Abstract: By exploiting a substantial amount of unlabeled data, semi-supervised learning (SSL) boosts recognition performance when only a limited number of labels are provided. However, conventional methods assume a class-balanced data distribution, which is difficult to realize in practice due to the long-tailed nature of real-world data. While addressing data imbalance is a well-explored area in supervised learning, directly transferring existing approaches to SSL is nontrivial, because the distribution of the unlabeled data remains unknown in SSL. In light of this, we introduce the Balanced Memory Bank (BMB), a framework for long-tailed semi-supervised learning. The core of BMB is an online-updated memory bank that caches historical features alongside their corresponding pseudo-labels; the memory is carefully maintained so that the data therein remain class-rebalanced. Furthermore, an adaptive weighting module works jointly with the memory bank to further re-calibrate the biased training process. Experimental results across various datasets demonstrate the superior performance of BMB compared with state-of-the-art approaches: for instance, improvements of 8.2% on the 1% labeled subset of ImageNet127 and 4.3% on the 50% labeled subset of ImageNet-LT.
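The class-rebalanced memory described above can be sketched minimally as per-class bounded FIFO queues: capping each class's slots keeps head classes from dominating the cache, and sampling an equal number of entries per class yields a rebalanced batch. This is an illustrative sketch under assumed names (`BalancedMemoryBank`, `push`, `sample_balanced`), not the paper's actual implementation, which also includes the adaptive weighting module.

```python
from collections import deque
import random

class BalancedMemoryBank:
    """Toy class-rebalanced memory bank (illustrative, not the paper's code)."""

    def __init__(self, num_classes, per_class_capacity):
        # One bounded FIFO queue per class; oldest features are evicted first,
        # so no class can occupy more than `per_class_capacity` slots.
        self.queues = [deque(maxlen=per_class_capacity)
                       for _ in range(num_classes)]

    def push(self, feature, pseudo_label):
        # Cache a historical feature under its (possibly noisy) pseudo-label.
        self.queues[pseudo_label].append(feature)

    def sample_balanced(self, per_class):
        # Draw up to `per_class` entries from every non-empty class,
        # yielding a class-rebalanced batch of (feature, label) pairs.
        batch = []
        for label, q in enumerate(self.queues):
            k = min(per_class, len(q))
            batch.extend((f, label) for f in random.sample(list(q), k))
        return batch
```

Even if a head class floods `push` with many more features than a tail class, the per-class capacity and the balanced sampler together keep the drawn batch close to uniform over the classes present in the memory.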
External IDs: dblp:journals/tmm/PengWLWJ25