UCPM: Uncertainty-Guided Cross-Modal Retrieval With Partially Mismatched Pairs

Quanxing Zha, Xin Liu, Yiu-Ming Cheung, Shu-Juan Peng, Xing Xu, Nannan Wang

Published: 01 Jan 2025, Last Modified: 30 Jun 2025. Venue: IEEE Trans. Image Process. 2025. License: CC BY-SA 4.0

Abstract: Manually annotating perfectly aligned labels for cross-modal retrieval (CMR) is labor-intensive. Collecting co-occurring data pairs from the Internet is a cost-effective alternative, but it inevitably introduces Partially Mismatched Pairs (PMPs), which significantly degrade retrieval performance unless they are handled explicitly. Previous efforts often use pair-wise similarity to filter out mismatched pairs; this operation is highly sensitive to mismatched or ambiguous data and thus leads to sub-optimal performance. To alleviate these concerns, we propose an efficient approach, termed UCPM (Uncertainty-guided Cross-modal retrieval with Partially Mismatched pairs), which can significantly reduce the adverse impact of mismatched data pairs. Specifically, a novel Uncertainty Guided Division (UGD) strategy divides the corrupted training data into confidently matched (clean), easily identifiable mismatched (noisy), and hard-to-determine (hard) subsets; the derived uncertainty simultaneously guides the learning of informative pairs and reduces the negative impact of potentially mismatched pairs. Meanwhile, an Uncertainty Self-Correction (USC) mechanism identifies and rectifies fluctuating uncertainty during training, which further improves the stability and reliability of the estimated uncertainty. In addition, a newly designed Trusted Margin Loss (TML) enhances the discriminability among hard pairs by dynamically adjusting their soft margins, amplifying the positive contributions of matched pairs while suppressing the negative impact of mismatched pairs. Extensive experiments on three widely used benchmark datasets verify the effectiveness and reliability of UCPM compared with existing state-of-the-art approaches and show significantly improved robustness to both synthetic and real-world PMPs.
The code is available at: https://github.com/qxzha/UCPM
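
As a rough illustration of the ideas named in the abstract, the sketch below splits pairs into clean, hard, and noisy subsets by a per-pair uncertainty score and scales a hinge margin by that score. It is not the authors' implementation: the abstract gives no formulas, so the thresholds `low` and `high`, the linear margin rule in `soft_margin`, and all function names are hypothetical choices made only for illustration.

```python
from typing import List, Sequence, Tuple


def divide_by_uncertainty(
    uncertainty: Sequence[float], low: float = 0.3, high: float = 0.7
) -> Tuple[List[int], List[int], List[int]]:
    """Split pair indices into (clean, hard, noisy) subsets.

    Assumption: uncertainty lies in [0, 1]; `low` and `high` are
    hypothetical thresholds, not values taken from the paper.
    """
    clean, hard, noisy = [], [], []
    for idx, u in enumerate(uncertainty):
        if u < low:
            clean.append(idx)   # confidently matched
        elif u > high:
            noisy.append(idx)   # easily identified as mismatched
        else:
            hard.append(idx)    # cannot be decided either way
    return clean, hard, noisy


def soft_margin(uncertainty: float, base_margin: float = 0.2) -> float:
    """Hypothetical rule: shrink the margin linearly as uncertainty grows."""
    return base_margin * (1.0 - uncertainty)


def hinge_triplet_loss(pos_sim: float, neg_sim: float, margin: float) -> float:
    """Standard hinge-based triplet loss on similarity scores."""
    return max(0.0, margin + neg_sim - pos_sim)


if __name__ == "__main__":
    scores = [0.1, 0.5, 0.9]
    print(divide_by_uncertainty(scores))
    # A hard pair (u = 0.5) gets half the base margin under this rule.
    print(hinge_triplet_loss(pos_sim=0.5, neg_sim=0.6, margin=soft_margin(0.5)))
```

Under this rule, a pair with high uncertainty contributes a smaller margin and therefore a smaller loss, which is one simple way to limit the influence of a likely mismatch. The paper's actual UGD, USC, and TML formulations may differ substantially; the linked repository is the authoritative source.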