Abstract: Semantic segmentation of remote sensing images (RSIs) is critical for various applications, including urban planning, agriculture, and disaster management. Existing methods often fail to capture fine-grained textures and periodic patterns in RSIs, leading to suboptimal results in complex terrains. To address these challenges, we propose a cross-domain coupling network (CDCNet) that leverages both domain-specific extraction and cross-domain coupling (CDC) to enrich contextual cues for semantic inference. Our CDCNet integrates a CDC layer within the encoder-decoder architecture to simultaneously refine representations in the frequency and spatial domains. This approach effectively models fine-grained textures and periodic patterns in the frequency domain, as well as edges, shapes, and broad structural elements in the spatial domain. Extensive experiments on the ISPRS Potsdam and LoveDA datasets demonstrate the superiority of CDCNet over several state-of-the-art methods. Ablation studies confirm the significant impact of the CDC layer, validating the effectiveness of our approach in handling RSIs.
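To make the idea of refining one feature map in both domains concrete, the following is a minimal NumPy sketch and not the CDC layer itself, whose internals the abstract does not specify. All names (`frequency_branch`, `spatial_branch`, `couple`, `alpha`) are hypothetical. The sketch assumes that the frequency branch reweights the 2D Fourier spectrum, that a 3x3 box filter stands in for a learned spatial convolution, and that the two branches are fused by a weighted sum.

```python
import numpy as np


def frequency_branch(feat, weights):
    """Reweight the 2D spectrum of each channel (assumed form of frequency refinement)."""
    h, w = feat.shape[-2:]
    spec = np.fft.rfft2(feat, axes=(-2, -1))      # shape (C, H, W//2 + 1), complex
    spec = spec * weights                         # per-frequency gain, broadcast over channels
    return np.fft.irfft2(spec, s=(h, w), axes=(-2, -1))


def spatial_branch(feat):
    """3x3 box filter with edge padding, a stand-in for a learned spatial convolution."""
    h, w = feat.shape[-2:]
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(feat)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / 9.0


def couple(feat, weights, alpha=0.5):
    """Fuse both branches with a weighted sum (an assumed, simplified coupling rule)."""
    feat = np.asarray(feat, dtype=float)
    return alpha * frequency_branch(feat, weights) + (1.0 - alpha) * spatial_branch(feat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))       # (channels, height, width)
    weights = np.ones((16, 16 // 2 + 1))          # identity gain; learned in a real network
    print(couple(feat, weights).shape)
```

In a trained network the spectral gains and the spatial kernel would be learned parameters, and the coupling would likely be richer than a fixed weighted sum; the sketch only shows how the two views of one feature map can be computed and merged.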