Focusing on neglected natural images: A self-supervised learning model for pan-sharpening

Libo Zhao, Xiaoli Zhang, Zeyu Wang

Published: 01 Jan 2025, Last Modified: 10 Jul 2025. Inf. Process. Manag. 2025. License: CC BY-SA 4.0

Abstract: Pan-sharpening integrates high-resolution panchromatic (PAN) and low-resolution multispectral (LRMS) images to produce high-resolution multispectral (HRMS) data. Existing deep learning methods are held back by the scarcity of satellite datasets and by traditional loss functions that capture only local correlations. To address both problems, we propose a two-stage self-supervised pan-sharpening framework. In the pretext stage, we introduce a pseudo-label generation strategy that synthesizes realistic MS-PAN pairs from 10,000 natural RGB images, significantly reducing reliance on large-scale satellite datasets. In the downstream stage, the model is fine-tuned on only 675 real satellite image pairs, with a frozen Transformer-based structural loss added to enforce global and local consistency. Extensive experiments demonstrate that our method outperforms 13 state-of-the-art methods, achieving improvements of up to 5.2% in Spectral Angle Mapper (SAM) and 1.2% in Spatial Correlation Coefficient (SCC) on reduced-resolution datasets, and up to 6.4% in Quality with No Reference (QNR) on full-resolution datasets. Visually, the proposed framework preserves spatial detail and reduces spectral distortion. It also performs robustly in normalized difference vegetation index (NDVI) applications and shows promising generalization to medical image fusion tasks.
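
The abstract does not spell out how the MS-PAN pairs are synthesized or how SAM is computed, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes a Wald-protocol-style degradation: the PAN image is taken as a weighted intensity of the RGB bands (BT.601 luma weights, chosen here purely for illustration), the LRMS image is a block-averaged downsample, and the original RGB image serves as the pseudo-label. The function names `synthesize_pair`, `spectral_angle`, and `sam` are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math

def synthesize_pair(rgb, scale=2, weights=(0.299, 0.587, 0.114)):
    """Build a pseudo (PAN, LRMS) pair from an RGB image (assumed scheme).

    rgb: H x W nested list of (r, g, b) tuples; H and W divisible by scale.
    Returns pan (H x W intensities) and lrms (H/scale x W/scale band tuples).
    The original rgb plays the role of the HRMS pseudo-label.
    """
    h, w = len(rgb), len(rgb[0])
    # PAN: weighted sum of the bands at full resolution.
    pan = [[sum(wt * c for wt, c in zip(weights, px)) for px in row]
           for row in rgb]
    # LRMS: average each scale x scale block, band by band.
    lrms = []
    for i in range(0, h, scale):
        out_row = []
        for j in range(0, w, scale):
            block = [rgb[y][x]
                     for y in range(i, i + scale)
                     for x in range(j, j + scale)]
            out_row.append(tuple(sum(px[b] for px in block) / len(block)
                                 for b in range(3)))
        lrms.append(out_row)
    return pan, lrms

def spectral_angle(u, v):
    """Angle in radians between two spectral vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0  # convention used here for all-zero pixels
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def sam(img_a, img_b):
    """Mean spectral angle over all pixels, in degrees (lower is better)."""
    angles = [spectral_angle(pa, pb)
              for row_a, row_b in zip(img_a, img_b)
              for pa, pb in zip(row_a, row_b)]
    return math.degrees(sum(angles) / len(angles))
```

Because SAM measures only the angle between spectral vectors, it ignores per-pixel brightness scaling and isolates spectral distortion, which is why it is reported alongside a spatial metric such as SCC.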