VSDM: Variable-Scale Diffusion Model Based on Dynamic Condition Guidance for Pansharpening

Yong Yang, Mengzhen Li, Shuying Huang, Weiguo Wan, Hangyuan Lu, Wei Tu

Published: 2024, Last Modified: 28 Jan 2025. IEEE Trans. Geosci. Remote. Sens. 2024. CC BY-SA 4.0

Abstract: Pansharpening aims to obtain a high-spatial-resolution multispectral (MS) image by fusing a low-spatial-resolution MS image with a high-spatial-resolution panchromatic (PAN) image. The results of most current pansharpening methods still suffer from spatial and spectral distortion. Diffusion models have shown outstanding performance in various image-processing tasks. However, maintaining the full image size throughout the diffusion process imposes a large computational burden, and directly using PAN and MS images acquired by different sensors together as the condition for guiding noise prediction leads to spatial and spectral distortions. To solve these problems, a variable-scale diffusion model (VSDM) based on dynamic condition guidance is proposed for pansharpening. It achieves better fusion performance by changing the way diffusion is carried out and by injecting dynamic conditions to guide the reverse process. In VSDM, a variable-scale diffusion manner (VSDMN) is designed to reduce the computational complexity of the model by reducing the size of the image during the diffusion process. A condition generator (CG) is constructed to generate dynamic conditions from the features learned from the PAN and upsampled MS images. In CG, a cross-attention dynamic convolution with a spatial and spectral attention mechanism is built to extract features from the PAN image, which improves the spatial and spectral consistency of the dynamic condition. Extensive experiments validate the effectiveness of the proposed VSDM against other state-of-the-art (SOTA) pansharpening methods in both quantitative and qualitative assessments. The source code will be released at https://github.com/MELiMZ/VSDM.
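To make the computational argument concrete, the following is a minimal sketch of why diffusing a smaller image is cheaper. It is not the authors' VSDMN, whose details the abstract does not give. It applies the standard DDPM forward-noising formula to a block-averaged image. The 4-band 64x64 input, the scale factor of 2, the linear noise schedule, and the helper names `average_pool` and `forward_noise` are all assumptions chosen for illustration.

```python
import numpy as np

def average_pool(img, factor):
    """Downsample a (C, H, W) array by block averaging.
    H and W must be divisible by factor."""
    c, h, w = img.shape
    return img.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def forward_noise(x0, t, alpha_bar, rng):
    """Standard DDPM forward step (not the paper's variant):
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)

# Assumed linear beta schedule with 1000 steps.
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

# Hypothetical 4-band high-resolution MS target.
hrms = rng.random((4, 64, 64))

# Assumed scale factor of 2: the diffused tensor has 4x fewer elements.
small = average_pool(hrms, 2)
x_t, eps = forward_noise(small, 500, alpha_bar, rng)
```

With a scale factor of 2, every denoising step operates on a quarter of the pixels, which is the kind of saving the abstract attributes to reducing the image size during diffusion. How VSDM varies the scale across steps and restores full resolution is not described in the abstract and is not modelled here.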