Abstract: Infrared small target detection (IRSTD) has progressed significantly through spatial-domain learning. However, the limited spatial semantics of single-frame images make it difficult to discriminate targets from similar-looking noise and complicate the detection of large-scale targets in their entirety. To address these issues, we propose the global spatial–frequency attention network (GSFANet), which enhances the distribution difference between targets and noise from a frequency-domain perspective while preserving the integrity of spatial information. The core innovations consist of three modules: 1) parametric wavelet downsampling (PWD), which preserves small-target details during frequency refinement to prevent feature fragmentation; 2) hierarchical gated kernel attention (HGKA), which captures cross-level frequency relationships through cross-channel kernel attention ($\text{C}^{2}\text{K}$) and maintains spatial coherence via cross-spatial gate attention (CSG), effectively bridging semantic gaps across layers; and 3) adaptive frequency-decoupled fusion (AdaFD), which dynamically fuses target-associated frequency components while suppressing noise. We further develop the AdaFL Loss to balance the gradients of multiscale targets and stabilize training. Experiments on three benchmark datasets demonstrate that GSFANet achieves superior detection performance and more robust segmentation in complex scenarios compared with state-of-the-art (SOTA) methods. Our code will be made public at https://github.com/dengfa02/GSFANet_IRSTD
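As a rough illustration of why wavelet-based downsampling can preserve small-target detail, the sketch below applies a plain, fixed Haar transform to a toy image. It is not the PWD module described above, which the abstract does not specify in detail; the function names `haar_downsample` and `haar_upsample` and the 8×8 test image are assumptions made for this example only. The point is that a one-pixel target can vanish under strided subsampling, while the four half-resolution Haar sub-bands retain it and allow exact reconstruction.

```python
import numpy as np


def haar_downsample(x):
    """Split a 2-D array (even height and width) into four half-resolution
    Haar sub-bands. Illustrative only; not the paper's PWD module."""
    x = np.asarray(x, dtype=float)
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0       # low-pass in both directions
    detail_w = (a - b + c - d) / 2.0     # difference along the width
    detail_h = (a + b - c - d) / 2.0     # difference along the height
    detail_diag = (a - b - c + d) / 2.0  # diagonal difference
    return approx, detail_w, detail_h, detail_diag


def haar_upsample(approx, detail_w, detail_h, detail_diag):
    """Invert haar_downsample exactly."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (approx + detail_w + detail_h + detail_diag) / 2.0
    x[0::2, 1::2] = (approx - detail_w + detail_h - detail_diag) / 2.0
    x[1::2, 0::2] = (approx + detail_w - detail_h - detail_diag) / 2.0
    x[1::2, 1::2] = (approx - detail_w - detail_h + detail_diag) / 2.0
    return x


# Toy image: a single bright pixel standing in for a small target.
image = np.zeros((8, 8))
image[2, 3] = 4.0

# Naive stride-2 subsampling samples only even columns and misses the target.
strided = image[::2, ::2]

# The Haar sub-bands keep the target's energy at half resolution.
bands = haar_downsample(image)
restored = haar_upsample(*bands)
```

Here `strided` is all zeros, whereas `bands[0]` still carries a nonzero response at the target's block and `restored` matches `image`. A learnable ("parametric") variant would presumably replace the fixed Haar weights with trained ones, but that is a guess rather than something the abstract states.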
External IDs:dblp:journals/tgrs/DengZXXLP25