Abstract: Pansharpening aims to reconstruct a high-resolution multispectral (HR-MS) image by fusing a multispectral (MS) image with a panchromatic (PAN) image. However, conventional pansharpening methods often struggle to bridge the modal gap between PAN and MS images. In this paper, we propose a novel cross-modal conditional fusion network (CroCFuN) for pansharpening, which builds on recent advances in optimal transport (OT) theory. Specifically, we formulate modal alignment in pansharpening as an OT problem and design a feature alignment module (FAM) that adjusts the MS image features to align with the PAN image features. In addition, we propose a feature-activation normalization fusion module (FANFM), which adopts a multi-stage conditional fusion strategy to generate high-quality fused features. Experimental results demonstrate that CroCFuN outperforms recent representative pansharpening methods in both visual quality and quantitative metrics. Code will be available at https://github.com/Florina2333.
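
To make the idea of OT-based feature alignment concrete, the following is a minimal sketch and not the FAM described in the paper. It assumes entropic-regularized OT solved by Sinkhorn iterations, uniform marginals, a squared Euclidean cost, and a barycentric projection of MS features onto PAN features. The function names `sinkhorn_plan` and `align_features` and the parameters `reg` and `n_iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropic-regularized OT plan between two uniform distributions.

    Assumption: uniform marginals and plain Sinkhorn scaling; the paper
    does not state which OT solver it uses.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # source (MS) marginal
    b = np.full(m, 1.0 / m)   # target (PAN) marginal
    K = np.exp(-cost / reg)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale rows toward marginal a
        v = b / (K.T @ u)     # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]


def align_features(ms_feat, pan_feat, reg=0.1):
    """Map MS feature vectors toward PAN features (barycentric projection).

    ms_feat:  (n, d) array of MS feature vectors (hypothetical layout)
    pan_feat: (m, d) array of PAN feature vectors (hypothetical layout)
    """
    # Pairwise squared Euclidean cost between MS and PAN features.
    diff = ms_feat[:, None, :] - pan_feat[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    # Scale the cost to [0, 1] so exp(-cost / reg) stays well conditioned.
    cost = cost / max(cost.max(), 1e-12)
    plan = sinkhorn_plan(cost, reg=reg)
    # Row-normalize the plan so each MS feature becomes a convex
    # combination of PAN features.
    weights = plan / plan.sum(axis=1, keepdims=True)
    return weights @ pan_feat, plan
```

In this sketch each aligned MS feature is a weighted average of PAN features, so the output lies in the convex hull of the PAN features. A learned module would typically operate on convolutional feature maps and be trained end to end, which this example does not attempt.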