Deep clustering for large-scale interpretable time series segmentation

Huiping Cao, Erick Draayer, Qixu Gong

Published: 08 Nov 2025, Last Modified: 10 Feb 2026 | Springer Data Mining and Knowledge Discovery | Everyone | CC BY 4.0

Abstract: Time series segmentation (TSS) is often an unsupervised data mining task that partitions a given time series into homogeneous regions. Existing TSS algorithms either scale poorly or perform poorly on the complex, large-scale time series (TS) commonly observed in real-world applications. This paper introduces Deep Clustering for Time Series Segmentation (DC-TSS), a domain-agnostic method that uses a three-phase neural model to segment a given time series. DC-TSS combines a carefully designed neural architecture with a new data augmentation approach to learn TS representations efficiently, uses a neural clustering model to refine these representations, introduces a novel, efficient component that infers segments from the clustered TS representations, and provides mechanisms for interpreting the segmentation results. We test DC-TSS on 27 multivariate time series datasets, which are much larger and more complex than those typically used in TSS studies. We also test DC-TSS on a more traditional repository of 98 simpler time series datasets. Experiments on both types of datasets provide an in-depth analysis of DC-TSS’s performance and limitations. We compare five variations of our method against seven strong baselines. The results show that DC-TSS significantly outperforms the other methods and scales well to larger and more complex datasets, while showing some limitations on shorter, simpler datasets. DC-TSS addresses a growing need for unsupervised TSS algorithms designed to segment large-scale, complex datasets, which are becoming more common as evolving technology allows greater volumes of data to be collected and stored.
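The abstract does not specify how segments are inferred from the clustered representations. As a rough illustration of the general idea only, and not the DC-TSS component itself, the sketch below assumes that each time step (or window) has already been assigned a cluster label, smooths those labels with a majority vote, and reports each maximal run of equal labels as a segment. The function names `smooth_labels` and `labels_to_segments` and the `width` parameter are hypothetical.

```python
from collections import Counter


def smooth_labels(labels, width=3):
    """Majority-vote filter over a centered window.

    Hypothetical smoothing step (not taken from the paper): it suppresses
    isolated label flips so they do not create spurious segments.
    """
    half = width // 2
    smoothed = []
    for i in range(len(labels)):
        window = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


def labels_to_segments(labels):
    """Return (start, end_exclusive, label) for each maximal run of equal labels."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments


# Toy cluster labels with one isolated flip at index 3.
cluster_labels = [0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]
segments = labels_to_segments(smooth_labels(cluster_labels, width=3))
print(segments)
```

In this toy input, the isolated label at index 3 is absorbed by the majority vote, so only one boundary remains. A larger `width` tolerates longer noisy runs at the cost of blurring boundaries between short genuine segments.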