Revisiting Time Series Outlier Detection: Definitions and Benchmarks

Kwei-Herng Lai; Daochen Zha; Junjie Xu; Yue Zhao; Guanchu Wang; Xia Hu

Published: 29 Jul 2021, Last Modified: 24 May 2023

Venue: NeurIPS 2021 Datasets and Benchmarks Track (Round 1)

Readers: Everyone

Keywords: outlier detection, time series data, benchmark

TL;DR: Revisiting time series outlier definitions and benchmarking the synthetic criterion and the existing algorithms with a behavior-driven taxonomy.

Abstract: Time series outlier detection has been extensively studied, with many advanced algorithms proposed in the past decade. Despite these efforts, very few studies have investigated how the existing algorithms should be benchmarked. In particular, evaluating on synthetic datasets has become common practice in the literature, so a general synthetic criterion for benchmarking algorithms is crucial. Defining one is non-trivial because existing synthetic methods differ widely across applications and outlier definitions are often ambiguous. To bridge this gap, we propose a behavior-driven taxonomy for time series outliers and categorize outliers into point-wise and pattern-wise outliers with clear context definitions. Following the new taxonomy, we present a general synthetic criterion and generate 35 synthetic datasets accordingly. We further identify 4 multivariate real-world datasets from different domains and benchmark 9 algorithms on both the synthetic and the real-world datasets. Surprisingly, we observe that some classical algorithms can outperform many recent deep learning approaches. The datasets, pre-processing and synthetic scripts, and algorithm implementations are publicly available at https://github.com/datamllab/tods/tree/benchmark
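
Illustration: the distinction between point-wise and pattern-wise outliers can be made concrete with a small synthetic example. The sketch below is not the authors' generator and does not reproduce their synthetic criterion; the base signal (a noisy sinusoid), the function names, and all parameter values are assumptions chosen only for illustration. It injects isolated spikes as point-wise outliers and replaces a subsequence with a differently shaped segment as a pattern-wise outlier, returning per-timestep labels in both cases.

```python
import numpy as np

# Hypothetical helpers for illustration only; not part of the TODS benchmark code.

def make_series(n=400, period=50, seed=0):
    """Noisy sinusoid standing in for 'normal' behaviour (an assumed base signal)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    return np.sin(2 * np.pi * t / period) + 0.05 * rng.standard_normal(n)

def inject_point_outliers(x, positions, magnitude=5.0):
    """Point-wise outliers: individual samples shifted far from their neighbours."""
    y = x.copy()
    labels = np.zeros(len(x), dtype=int)
    for p in positions:
        y[p] += magnitude
        labels[p] = 1
    return y, labels

def inject_pattern_outlier(x, start, length, period=10):
    """Pattern-wise outlier: a whole subsequence replaced by a sinusoid with a
    different period. Each value stays in the normal range; only the shape of
    the segment is unusual."""
    y = x.copy()
    labels = np.zeros(len(x), dtype=int)
    t = np.arange(length)
    y[start:start + length] = np.sin(2 * np.pi * t / period)
    labels[start:start + length] = 1
    return y, labels

series = make_series()
with_points, point_labels = inject_point_outliers(series, [100, 250])
with_pattern, pattern_labels = inject_pattern_outlier(series, start=300, length=40)
```

In this sketch a simple amplitude threshold would flag the two spikes but not the altered segment, whose values never leave the normal range. That is one reason separate definitions for the two categories are useful when constructing benchmarks.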

Supplementary Material: zip

URL: https://github.com/datamllab/tods/tree/benchmark

Contribution Process Agreement: Yes

Dataset Url: https://github.com/datamllab/tods/tree/benchmark/benchmark

License: Apache-2.0

Author Statement: Yes
