Keywords: Evaluation, Metric, Infrared Small Target Detection, IRSTD
TL;DR: This paper addresses limitations in infrared small target detection (IRSTD) evaluation, aiming to establish a comprehensive hierarchical evaluation framework for IRSTD models.
Abstract: As an essential vision task, infrared small target detection (IRSTD) has seen significant advancements through deep learning.
However, critical limitations in current evaluation protocols impede further progress.
First, existing methods rely on fragmented pixel-level and target-level metrics, which fail to provide a comprehensive view of model capabilities.
Second, an excessive emphasis on overall performance scores obscures crucial error analysis, which is vital for identifying failure modes and improving real-world system performance.
Third, the field predominantly adopts dataset-specific training-testing paradigms, hindering the understanding of model robustness and generalization across diverse infrared scenarios.
This paper addresses these issues by introducing a hybrid-level metric incorporating pixel- and target-level performance, proposing a systematic error analysis method, and emphasizing the importance of cross-dataset evaluation.
Together, these contributions offer a more thorough and principled hierarchical analysis framework, ultimately fostering the development of more effective and robust IRSTD models.
An open-source toolkit has been released to facilitate standardized benchmarking.
Croissant File: json
Dataset URL: https://huggingface.co/datasets/yooweey/AugmentedIRSTD1kTestset
Code URL: https://github.com/lartpang/PyIRSTDMetrics
Primary Area: Datasets & Benchmarks for applications in computer vision
Submission Number: 1346