Constructing Balanced Training Samples: A New Perspective on Long-Tailed Classification

Published: 01 Jan 2025, Last Modified: 14 Nov 2025 · IEEE Trans. Multim. 2025 · CC BY-SA 4.0
Abstract: The defining characteristic of long-tailed classification is that severe sample imbalance biases the model towards head categories. While the long-tailed distribution of a multimedia dataset is fixed, we can still improve the acquisition of balanced training samples and their corresponding features during learning. This paper designs a sample provider that constructs balanced training samples to promote the learning of comprehensive features, and proposes a Siamese parameter-sharing framework to handle data with long-tailed distributions. Specifically, one branch of the Siamese network classifies samples drawn with conventional random-crop sampling, while the other branch combines the constructed balanced samples with hybrid optimization to capture balanced features and identify more precise category boundaries. This combination not only facilitates learning on the long-tailed distribution but also strengthens the model's extraction of balanced features through the incorporation of contrastive learning. Extensive experiments on the CIFAR10-LT, CIFAR100-LT, ImageNet-LT, and iNaturalist 2018 datasets demonstrate that our model achieves superior performance while retaining the benefits of end-to-end training; in particular, our method reaches 60.7% accuracy on ImageNet-LT with an end-to-end ResNeXt-50 backbone.
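To make the described architecture concrete, the following is a minimal PyTorch-style sketch of the two ingredients the abstract names: a class-balanced "sample provider" and a two-branch network whose branches share one backbone. It is an illustration under assumptions, not the authors' released code; all names here (BalancedSampler, SiameseLongTail, the constructor arguments) are hypothetical, and the paper's actual sampling and hybrid-optimization details may differ.

```python
# A hypothetical sketch of (1) a balanced sample provider and
# (2) a parameter-sharing Siamese forward pass, assuming PyTorch.
import random
from collections import defaultdict

import torch.nn as nn
from torch.utils.data import Sampler


class BalancedSampler(Sampler):
    """Yields indices so every class contributes equally per epoch."""

    def __init__(self, labels, num_samples_per_class):
        self.by_class = defaultdict(list)
        for idx, y in enumerate(labels):
            self.by_class[y].append(idx)
        self.num_samples_per_class = num_samples_per_class

    def __iter__(self):
        indices = []
        for idxs in self.by_class.values():
            # Sample with replacement: tail classes are oversampled,
            # head classes are effectively subsampled.
            indices.extend(random.choices(idxs, k=self.num_samples_per_class))
        random.shuffle(indices)
        return iter(indices)

    def __len__(self):
        return len(self.by_class) * self.num_samples_per_class


class SiameseLongTail(nn.Module):
    """Two branches sharing one backbone (parameter sharing)."""

    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone  # shared feature extractor
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x_random, x_balanced):
        # Branch 1: conventional randomly sampled (long-tailed) batch.
        f_rand = self.backbone(x_random)
        # Branch 2: class-balanced batch from the sample provider.
        f_bal = self.backbone(x_balanced)
        # Return logits for both branches plus features, so a training
        # loop can add a contrastive term on (f_rand, f_bal).
        return self.classifier(f_rand), self.classifier(f_bal), f_rand, f_bal
```

In such a setup, a training step would typically combine a cross-entropy loss on each branch's logits with a contrastive term on the features, mirroring the abstract's pairing of hybrid optimization with contrastive learning; the loss weighting actually used in the paper is not specified here.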