CLIP2LE: A Label Enhancement Fair Representation Method via CLIP

Pu Wang, YinSong Xiong, Zhuoran Zheng

Published: 2026, Last Modified: 12 Mar 2026IEEE Trans. Big Data 2026EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: Label enhancement is a novel label shift strategy that aims to integrate the feature space with the logical label space to obtain a high-quality label distribution. This label distribution can serve as a soft target for algorithmic learning, akin to label smoothing, thereby enhancing the performance of various learning paradigms including multi-label learning, single positive multi-label learning, and partial-label learning. However, limited by dataset type and annotation inaccuracy, the same label enhancement algorithm on different datasets struggles to achieve consistent performance, for reasons derived from the following two insights: 1) Differential Contribution of Feature Space and Logical Label Space: The feature space and logical label space of different datasets contribute differently to generating an accurate label distribution; 2) Presence of Noise and Incorrect Labels: Some datasets contain noise and inaccurately labeled samples, leading to divergent outputs for similar inputs. To address these challenges, we propose leveraging CLIP (Contrastive Language-Image Pre-training) as a foundational strategy, treating the feature space and the logical label space as two distinct modalities. By recoding these modalities before applying the label enhancement algorithm, we aim to achieve a fair and robust representation. In addition, we further explained the reasonableness of our motives in the discussion session. Extensive experimental results demonstrate the effectiveness of our approach to help existing label enhancement algorithms improve their performance on several benchmarks.

External IDs:dblp:journals/tbd/WangXZ26