Time-Series Clustering: A Comprehensive Study of Data Mining, Machine Learning, and Deep Learning Methods

Published: 01 Sept 2025, Last Modified: 15 Aug 2025, Proc. VLDB Endow. 2025, CC BY-SA 4.0
Abstract: Time-series clustering is a key task in time-series analysis, enabling unsupervised data exploration and often serving as a subroutine for other tasks. Despite decades of active cross-disciplinary research, benchmarking of time-series clustering methods has received limited attention. Existing studies have (i) excluded popular methods and entire method classes; (ii) used a narrow range of distance measures; (iii) evaluated only a few datasets; (iv) lacked statistical validation; (v) had poor reproducibility; or (vi) relied on questionable evaluation setups. The rise of deep learning, especially foundation models claiming broad generalization, further emphasizes the need for comprehensive evaluation, as their role in time-series clustering remains largely untested. To address these gaps, we evaluate 84 time-series clustering methods across 10 method classes from data mining, machine learning, and deep learning. Our analysis spans 128 time-series datasets and uses rigorous statistical methods. Within a fair comparison framework, we (i) identify the top-performing method in each class; (ii) highlight previously overlooked, high-performing classes; (iii) challenge assumptions about elastic distance measures; (iv) refute the claimed superiority of deep learning methods, including foundation models; (v) expose reproducibility issues; (vi) analyze performance variation across dataset properties; and (vii) assess scalability. Our findings reveal an illusion of progress: no method significantly outperforms the decade-old $k$-Shape method. Still, we highlight a deep learning-based approach with notable promise. Our results provide a strong benchmark for advancing time-series clustering, and we have open-sourced our work to support future research.