MASDG: Multiview Augmented Single-Source Domain Generalization Method for Robust Remote Sensing Building Extraction

Yunjiao Liu, Yuanyuan Liu, Kejun Liu, Yuxuan Huang, Chang Tang, Wujie Zhou, Zhe Chen, Wei Xiang, Hongyan Zhang

Published: 2025, Last Modified: 15 Jan 2026 | IEEE Trans. Geosci. Remote. Sens. 2025 | CC BY-SA 4.0

Abstract: Despite advances in deep learning for remote sensing building extraction (RSBE), multitarget domain RSBE (MD-RSBE) remains challenging: it requires transferring knowledge from a single labeled source domain to multiple unlabeled target domains that differ in texture, style, and semantics. Existing domain adaptation (DA) and domain generalization (DG) methods face significant limitations: DA requires training on the target domain, while DG requires training on multiple source domains, leading to high training costs and poor generalization in practical MD-RSBE scenarios. To address this, we propose a multiview augmented single-source DG (MASDG) method, which mitigates domain shifts between the RS source and target domains by enriching the diversity of the source domain through multiview augmentation and by enforcing semantic consistency. Specifically, MASDG consists of three key components: a texture-level domain augmentation (TDA) module, a style-level domain augmentation (SDA) module, and semantic-invariant representation learning (SRL). To mitigate texture-level domain shift, TDA introduces a parameter-optimized multilayer random convolution that modifies the texture of the source image, generating texture-augmented image pairs that simulate real-world texture diversity across RS domains. Then, for each image pair from TDA, SDA employs two parallel encoders, a general feature encoder and a batch-guided style encoder, to form multiview building features, further mitigating style-level domain shift. Finally, SRL learns semantic-invariant representations via a dual mechanism comprising a multiview segmentation loss and a semantic consistency loss. The former supervises predictions generated from diverse feature views (original, texture-augmented, and style-augmented), while the latter performs semantic alignment by minimizing distribution discrepancies among these predictions, bridging semantic inconsistency to enable robust segmentation. Extensive experiments across three different MD-RSBE settings with seven different target domains demonstrate that our MASDG outperforms existing state-of-the-art methods by a significant margin.
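
To make two of the ideas in the abstract more concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes plain random (not parameter-optimized) convolution kernels for the texture augmentation, and it assumes a mean-distribution KL divergence as one possible way to measure "distribution discrepancies among predictions"; the abstract does not specify either choice. All function names (`random_conv`, `texture_augment`, `consistency_loss`) are hypothetical.

```python
import math
import random


def random_conv(image, kernel_size=3, rng=None):
    """Apply one randomly weighted convolution (zero padding) to a 2D image."""
    rng = rng or random.Random()
    pad = kernel_size // 2
    # Assumption: Gaussian kernel weights; the paper optimizes its parameters.
    kernel = [[rng.gauss(0.0, 1.0) for _ in range(kernel_size)]
              for _ in range(kernel_size)]
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(kernel_size):
                for kx in range(kernel_size):
                    yy, xx = y + ky - pad, x + kx - pad
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * image[yy][xx]
            out[y][x] = acc
    return out


def texture_augment(image, num_layers=2, seed=None):
    """Stack several random convolutions to perturb texture, keeping the layout."""
    rng = random.Random(seed)
    out = image
    for _ in range(num_layers):
        out = random_conv(out, rng=rng)
    return out


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(view_logits):
    """Average KL from each view's prediction to the mean prediction.

    view_logits holds one list of class logits per view (for example original,
    texture-augmented, style-augmented) for a single pixel.
    """
    probs = [softmax(logits) for logits in view_logits]
    n = len(probs)
    mean = [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]
    return sum(kl_divergence(p, mean) for p in probs) / n
```

In this sketch the loss is close to zero when all views agree and grows as their predictions diverge, which is the behaviour a semantic consistency term is meant to penalize. In a real segmentation model the same computation would be applied per pixel and averaged over the image.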

External IDs: dblp:journals/tgrs/LiuLLHTZCXZ25