Aligning time series anomaly detection research with practical applications

TMLR Paper5931 Authors

19 Sept 2025 (modified: 01 Oct 2025), Under review for TMLR, CC BY 4.0
Abstract: The field of time series anomaly detection is hindered not by its models and algorithms, but rather by its inadequate evaluation methodologies. A growing number of researchers have claimed in recent years that various prevalent metrics, datasets, and benchmarking practices employed in the literature are flawed. In this paper, we echo this sentiment by demonstrating that widespread metrics are incongruent with desirable model behaviour in practice and that datasets are plagued by inaccurate labels and unrealistic anomaly density, amongst other issues. Furthermore, we provide suggestions and guidance on realigning theoretical research with the demands of practical applications, with the goal of establishing a stable, principled benchmarking framework within which models may be evaluated and compared fairly. Finally, we offer a perspective on the main challenges and unanswered questions in the field, alongside potential future research directions.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yiming_Ying1
Submission Number: 5931