HSTNet: A Hybrid Spatial-Channel Sparse Transformer Network for Infrared Small Target Detection

Published: 2025, Last Modified: 10 Jan 2026, IEEE Trans. Geosci. Remote. Sens. 2025, CC BY-SA 4.0
Abstract: Infrared small target detection (IRSTD) has many applications in security-related fields, e.g., uncrewed aerial vehicle (UAV) systems and low-altitude threat perception. Previous works improve IRSTD performance mainly through global information modeling and multiscale feature fusion. However, these works inevitably ignore the distinction between targets and backgrounds. To address this problem, we propose an effective hybrid spatial–channel sparse transformer network, HSTNet. Specifically, we first propose a hybrid spatial–channel sparse transformer (SCST) module that sparsely models the relationship between targets and background while effectively maintaining long-range dependencies. Second, to preserve small target details during feature compression, we introduce a multiscale detail enhancement (MSDE) module. Third, we propose a scale-location aware joint (SLJ) loss to improve target perception at various scales and locations. Furthermore, to enhance the diversity and quantity of available data, we develop the IRSTD-Large dataset, comprising 19 558 annotated infrared images with diverse backgrounds. Finally, extensive experiments and comparisons are conducted on multiple dominant IRSTD datasets, e.g., NUAA-SIRST, IRSTD-1k, and IRSTD-Large. The results show that the proposed network surpasses current promising methods and achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/juranccc/HSTNet