Cross-Modal Contrastive Pansharpening via Uncertainty Guidance

Haoying Zeng, Xiaoyuan Yang, Kangqing Shen, Yixiao Li, Jin Jiang, Fangyi Li

Published: 01 Jan 2025 · Last Modified: 14 Oct 2025 · Crossref · CC BY-SA 4.0
Abstract: Deep learning (DL)-based pansharpening has been widely applied in high-resolution imaging. Yet artifacts stemming from poor generalization and oversmoothing remain a persistent challenge, primarily due to the mismatch between simulated training datasets and unseen real-world scenarios. Current approaches address these issues through unsupervised frameworks or generative models, but modal inconsistency is not fully considered, leading to suboptimal performance. In this article, we propose a contrastive cross-modal framework via uncertainty guidance (UGCC), which comprises three key modules: a contrast feature enhancement module (CFEM), a cross-modal compensation module (CMCM), and an uncertainty guidance module (UGM). First, to enhance generalization and reduce overfitting, CFEM augments robust contrast features and learns them sparsely in latent space, where sample distributions are refined and redundant information is filtered from highly similar sample pairs for improved training stability. Second, CMCM effectively mitigates modal inconsistency through domain transfer and collaborative attention, achieving efficient modal separation and interaction. Finally, to adaptively balance the contributions of CMCM and CFEM based on prediction confidence, a hybrid loss function is designed in which UGM adjusts the weights by quantifying statistical versus structural uncertainties. Extensive experiments on QuickBird, Gaofen-2, WorldView-2, and WorldView-3 demonstrate that the proposed method surpasses or matches state-of-the-art approaches. Ablation studies further validate the effectiveness of each component. The code is available at: https://github.com/meimeizeng/UGCF.
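The abstract does not specify the exact form of UGM's uncertainty-based weighting. As an illustration only, one common way to balance two loss terms by confidence is a convex combination with weights inversely proportional to each branch's quantified uncertainty; the function name and the inverse-variance form below are assumptions, not the paper's actual formulation:

```python
def uncertainty_weighted_loss(loss_cmcm, loss_cfem, var_stat, var_struct, eps=1e-8):
    """Hypothetical sketch of an uncertainty-guided hybrid loss.

    Each branch's loss is weighted inversely to its quantified uncertainty
    (statistical for CMCM, structural for CFEM), so the more confident
    branch dominates the combined objective. `eps` guards against
    division by zero.
    """
    w_stat = 1.0 / (var_stat + eps)
    w_struct = 1.0 / (var_struct + eps)
    total = w_stat + w_struct
    # Normalized convex combination: weights sum to 1.
    return (w_stat * loss_cmcm + w_struct * loss_cfem) / total
```

With equal uncertainties this reduces to a plain average of the two losses; as one branch's uncertainty shrinks, the combined loss shifts toward that branch's term.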