Abstract: Infrared small target detection (IRSTD) aims to detect small, faint targets in infrared images. These targets lack distinct texture and morphology and sit in complex backgrounds with numerous distractors. Current deep-learning methods typically prioritize preserving target features while neglecting the crucial background context, which results in false alarms and missed detections. To tackle this issue, we propose a novel approach that focuses separately on candidate target responses and background context during the encoding stage and aligns them during the decoding stage. Specifically, we introduce the progressive background-aware transformer (PBT), which adopts an asymmetric encoder-decoder architecture. The encoder, equipped with task-specific frequency-domain priors, extracts candidate target responses from shallow blocks and background context features from deep blocks. The subsequent hierarchical decoder then refines the candidate target responses stage by stage under the guidance of rich background context, leading to more accurate results. Our experiments demonstrate that PBT surpasses state-of-the-art IRSTD methods across various datasets. The code and dataset are available at https://github.com/Heron0625/PBT.
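To make the idea of a frequency-domain prior concrete, the following is a minimal sketch and not the PBT encoder itself. The abstract does not specify how the prior is built, so the sketch assumes the simplest reading: low spatial frequencies approximate the background, and the high-frequency residual highlights candidate small targets. The helper names `dft2` and `split_bands`, the cutoff `radius`, and the synthetic 8x8 image are all illustrative assumptions.

```python
import cmath
import math

def dft2(img, inverse=False):
    """Naive 2D DFT of a list-of-lists image (only suitable for tiny demos)."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            s = 0j
            for y in range(h):
                for x in range(w):
                    angle = 2 * cmath.pi * (u * y / h + v * x / w)
                    s += img[y][x] * cmath.exp(sign * 1j * angle)
            out[u][v] = s / (h * w) if inverse else s
    return out

def split_bands(img, radius):
    """Split img into a low-frequency part (background-like, assumed)
    and a high-frequency residual (candidate-target-like, assumed)."""
    h, w = len(img), len(img[0])
    spec = dft2(img)
    low_spec = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            # Distance from the DC term, accounting for frequency wrap-around.
            du, dv = min(u, h - u), min(v, w - v)
            if du * du + dv * dv <= radius * radius:
                low_spec[u][v] = spec[u][v]
    low = [[c.real for c in row] for row in dft2(low_spec, inverse=True)]
    high = [[img[y][x] - low[y][x] for x in range(w)] for y in range(h)]
    return low, high

# Synthetic scene: smooth background plus one bright pixel at row 3, column 4.
N = 8
img = [[10 + 2 * math.cos(2 * math.pi * x / N) for x in range(N)] for _ in range(N)]
img[3][4] += 50

low, high = split_bands(img, radius=1)

# Location of the strongest high-frequency response.
peak = max(((y, x) for y in range(N) for x in range(N)),
           key=lambda p: high[p[0]][p[1]])
print(peak)
```

In this toy scene the smooth background ends up in `low`, while the single bright pixel dominates `high`. A learned model would refine such candidate responses using the background context rather than relying on a fixed cutoff.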