Abstract: Infrared small targets are often small in scale and carry weak semantic features, which makes them difficult to detect. To address this, we propose a novel network for infrared small target detection that combines local detail information with global contextual information. To preserve the local and high-frequency details present in infrared images, we introduce a High-frequency Aware Encoder. To extract contextual information from multi-scale feature maps, we propose a Multi-scale Context Learning Bottleneck that repeatedly incorporates contextual information and performs cross-level fusion, enabling small targets to be recognized from their surroundings. Finally, a lightweight Transformer Decoder restores the feature map while focusing attention on the target pixels. Experimental results on the IRSTD-1k dataset demonstrate that our method outperforms state-of-the-art approaches.