Abstract: Semantic segmentation of Remote Sensing Images (RSIs) assigns a semantic label to every pixel. RSIs are rich in spatial and spectral data, revealing diverse material and object characteristics. Yet current RSI-focused computer vision models struggle with large intra-class variation and inter-class similarity because they make limited use of spectral information. Inspired by digital signal processing theory, we propose the Frequency Domain Feature-Guided Network (FFGNet) for RSI semantic segmentation. FFGNet first generates frequency domain features via patch partitioning and the 2D discrete cosine transform (DCT). Our Frequency Enhancement Attention (FEA) module then distinguishes and intensifies frequency components to retain detailed information. These enhanced features are integrated with Spatial-Spectral Attention (SSA) to enrich the spectral signal. In the inference phase, these features are upsampled and combined with the decoded features, emphasizing spectral details. Additionally, our novel loss function combines a frequency loss with the cross-entropy loss. Experiments on the LoveDA and ISPRS Potsdam datasets demonstrate FFGNet's effectiveness, where it outperforms other mainstream models. An ablation study further validates our dual-guidance design.
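To make the first step concrete, the patch partitioning and 2D DCT mentioned in the abstract can be sketched as follows. This is a minimal illustration only, not the FFGNet implementation: the function names (`dct_matrix`, `dct2`, `patch_dct`), the single-channel input, the non-overlapping patches, and the orthonormal DCT-II scaling are all assumptions made for the example.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix (assumed scaling)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def dct2(patch):
    """2D DCT-II of a square patch, computed as C @ patch @ C^T."""
    c = dct_matrix(len(patch))
    c_t = [list(r) for r in zip(*c)]
    return matmul(matmul(c, patch), c_t)


def patch_dct(image, patch_size):
    """Split a single-channel image into non-overlapping square patches
    and return a grid holding the 2D DCT coefficients of each patch."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    grid = []
    for top in range(0, height, patch_size):
        grid_row = []
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            grid_row.append(dct2(patch))
        grid.append(grid_row)
    return grid


if __name__ == "__main__":
    # Hypothetical 8 x 8 constant image split into 4 x 4 patches.
    image = [[1.0] * 8 for _ in range(8)]
    features = patch_dct(image, 4)
    print(len(features), len(features[0]))    # grid of patches
    print(round(features[0][0][0][0], 6))     # DC coefficient of first patch
```

For a constant patch, all of the energy lands in the DC coefficient (index `[0][0]`) and the remaining coefficients are zero up to floating-point error, which is why frequency-domain features separate smooth regions from fine detail. A real pipeline would more likely apply a library routine such as `scipy.fft.dctn` per channel on array data.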