Diff-HRNet: A Diffusion Model-Based High-Resolution Network for Remote Sensing Semantic Segmentation
Abstract: Semantic segmentation methods based on deep neural networks predominantly employ supervised learning and thus rely heavily on the quantity and quality of annotated samples. Due to the complexity of high-resolution remote sensing imagery, obtaining sufficient and precise pixel-level labeled data is highly challenging. This letter introduces a novel self-supervised learning method that uses a pretrained denoising diffusion probabilistic model (DDPM) to leverage semantic information from large-scale unlabeled remote sensing imagery. Building on this, a multistage fusion scheme between pretrained features and high-resolution features is proposed, enabling the network to exploit the prior information provided by the pretrained model more effectively while preserving the rich semantic details of high-resolution images. Experimental results on two remote sensing semantic segmentation datasets show that the proposed Diff-HRNet outperforms all compared methods, demonstrating the potential of pretrained diffusion models for extracting crucial feature representations in semantic segmentation tasks.
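The multistage fusion scheme described in the abstract can be sketched in miniature: intermediate feature maps from a pretrained diffusion denoiser (coarse but semantically rich) are upsampled to match a high-resolution branch and merged with it. The following NumPy sketch is purely illustrative; the function names, shapes, and nearest-neighbor upsampling are assumptions, not the paper's actual implementation.

```python
import numpy as np

def nearest_upsample(feat, factor):
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_stage(hr_feat, ddpm_feat):
    """One hypothetical fusion stage: upsample coarse diffusion features
    to the high-resolution grid, then concatenate along channels."""
    factor = hr_feat.shape[1] // ddpm_feat.shape[1]
    return np.concatenate([hr_feat, nearest_upsample(ddpm_feat, factor)], axis=0)

# Toy features: a high-resolution branch at 64x64 with 32 channels, and
# diffusion-model features at 16x16 with 256 channels (illustrative sizes).
hr = np.random.rand(32, 64, 64)
ddpm = np.random.rand(256, 16, 16)
fused = fuse_stage(hr, ddpm)
print(fused.shape)  # (288, 64, 64)
```

In a real network the concatenation would typically be followed by learned convolutions at each stage, so that the segmentation head sees both the fine spatial detail of the high-resolution branch and the diffusion prior.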
External IDs: dblp:journals/lgrs/WuLSPLC25