CLDiff: Weakly Supervised Cloud Detection With Denoising Diffusion Probabilistic Models

Published: 01 Jan 2024, Last Modified: 16 May 2025. IEEE Trans. Geosci. Remote Sens. 2024. License: CC BY-SA 4.0
Abstract: Cloud detection is an essential step in remote sensing (RS) image processing, supporting a wide range of downstream applications. However, existing fully supervised cloud detection methods rely on massive pixel-wise annotations, which are expensive and time-consuming to obtain. To alleviate the annotation burden, weakly supervised cloud detection (WSCD) has recently received extensive attention. One standard approach performs cloud detection within a classification paradigm, which inevitably suffers from category ambiguity when detecting semitransparent clouds. To tackle this problem, we propose a novel WSCD framework based on the diffusion model, termed CLDiff. Specifically, a multiscale feature rectification (MFR) module is introduced to extract multiscale semantic features in the encoder, enabling unambiguous identification of clouds and mitigating interference from bright background objects. Considering that clouds exhibit varying optical thicknesses, a diffusion decoder is developed to model the intraclass variations of clouds with a generative strategy, improving thin cloud detection. First, a Gaussian modulation function recalibrates ambiguous cloud activations to emphasize semitransparent clouds. These modulated activations then serve as semantic guidance to optimize the diffusion process. This design enables CLDiff to activate cloud contours under definite semantic conditions and avoids the additional semantic-learning branches used in previous methods. Experimental results demonstrate that CLDiff achieves state-of-the-art performance in WSCD. A public reference implementation of this work in PyTorch is available at https://github.com/YLiu-creator/CLDiff.
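The two core ideas of the abstract can be illustrated with a minimal PyTorch sketch: a Gaussian modulation that boosts mid-level (ambiguous, semitransparent-cloud) activations, and a standard DDPM noise-prediction step conditioned on the modulated activation map. The function names, hyperparameters, and conditioning-by-concatenation scheme below are illustrative assumptions, not the authors' released implementation (see the repository linked above).

```python
import torch
import torch.nn.functional as F

def gaussian_modulation(cam, mu=0.5, sigma=0.25):
    """Recalibrate class-activation values with a Gaussian centered at an
    ambiguous activation level, emphasizing semitransparent-cloud responses.
    `mu` and `sigma` are illustrative hyperparameters, not taken from the paper."""
    lo = cam.amin(dim=(-2, -1), keepdim=True)
    hi = cam.amax(dim=(-2, -1), keepdim=True)
    cam = (cam - lo) / (hi - lo + 1e-6)                   # normalize activations to [0, 1]
    boost = torch.exp(-((cam - mu) ** 2) / (2 * sigma ** 2))
    return torch.clamp(cam + boost * (1.0 - cam), 0.0, 1.0)

def ddpm_training_step(denoiser, mask, cam, timesteps, alphas_cumprod):
    """One DDPM epsilon-prediction step where the modulated activation map is
    the semantic condition, concatenated to the noisy mask along channels.
    `denoiser(x, t)` is a hypothetical UNet that accepts a 2-channel input."""
    cond = gaussian_modulation(cam)                        # (B, 1, H, W) semantic guidance
    t = torch.randint(0, timesteps, (mask.size(0),), device=mask.device)
    noise = torch.randn_like(mask)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    noisy_mask = a_bar.sqrt() * mask + (1 - a_bar).sqrt() * noise  # forward process q(x_t | x_0)
    pred_noise = denoiser(torch.cat([noisy_mask, cond], dim=1), t)
    return F.mse_loss(pred_noise, noise)                   # standard noise-prediction objective
```

In this sketch the Gaussian bump is centered on mid-range activations, so confident cloud and background pixels are left largely unchanged while uncertain (thin-cloud) responses are lifted before they are used as the diffusion condition.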