Abstract: Long-tailed data distributions occur frequently in real-world scenarios, yet deep learning is often not effective enough on such distributions. To improve effectiveness on long-tailed data, data augmentation is widely used to balance the class distribution by generating new samples. However, most existing studies implicitly assume class independence and ignore the effect of inter-class relations on data augmentation, so some generated samples may be unrepresentative and useless for balancing the class distribution. Motivated by this, we propose a new data augmentation method based on sparse class-correlation exploitation, which generates more representative samples by utilizing class correlations and thereby effectively balances the class distribution of long-tailed data. In the proposed method, a sparse class-correlation exploration module first explores the potential correlations among multiple classes to boost classification performance. Based on these class correlations, pivotal seed samples are generated by maximizing the sparse representation of challenging samples. Meanwhile, an ambiguity-filtered translation module is designed to generate more representative new samples for the target classes from the obtained seed samples by enhancing class consistency and suppressing deviation from the target classes. In addition, we introduce a self-supervised feature and fuse it with the discriminative feature to explore more accurate class correlations. Experimental results show that the proposed method outperforms state-of-the-art methods while using only a small number of generated samples.
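The abstract does not specify how the sparse class-correlation exploration module is formulated, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes each class is summarized by a prototype vector (for example, a mean feature) and expresses each prototype as a sparse combination of the other prototypes using a generic lasso objective solved with ISTA. The names `sparse_class_correlation`, `soft_threshold`, `prototypes`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm: shrink each entry toward zero by t.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_class_correlation(prototypes, lam=0.1, n_iter=500):
    """Assumed formulation (not taken from the paper): for each class c, solve
        min_w 0.5 * ||p_c - D_c w||^2 + lam * ||w||_1
    with ISTA, where the columns of D_c are the prototypes of all other classes.
    Returns an (n_classes, n_classes) matrix W with a zero diagonal;
    a nonzero W[c, k] suggests class c is correlated with class k."""
    n_classes = prototypes.shape[0]
    W = np.zeros((n_classes, n_classes))
    for c in range(n_classes):
        others = [k for k in range(n_classes) if k != c]
        D = prototypes[others].T          # shape: (feature_dim, n_classes - 1)
        y = prototypes[c]
        # Step size 1/L, where L is the squared spectral norm of D.
        step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
        w = np.zeros(len(others))
        for _ in range(n_iter):
            grad = D.T @ (D @ w - y)      # gradient of the least-squares term
            w = soft_threshold(w - step * grad, step * lam)
        W[c, others] = w
    return W
```

Under these assumptions, a tail class whose prototype is well reconstructed by a few head-class prototypes would have a row of `W` with only a few nonzero entries, and those entries indicate which classes could supply seed samples for it.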
External IDs:dblp:journals/tkde/QiMZGGJZ25