IPS: Instance Profile for Shapelet Discovery for Time Series Classification

Published: 01 Jan 2022, Last Modified: 22 Jun 2025ICDE 2022EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Time series classification (TSC) has been one of the most fundamental problems of time series data. Time series shapelets (or simply, shapelets) are discriminative subsequences that have been recently found both effective and interpretable for solving TSC. However, shapelet discovery is known to be computationally costly. Meanwhile, matrix profile has been recently proposed for efficient motif discovery and anomaly detection. Our preliminary experiment shows that a direct adoption of the matrix profile on TSC does not bring superior classification accuracy. We have identified two main issues of such an adoption: 1) discords as “shapelets”, and 2) lack of shapelet diversity. In response to these issues, we propose instance profile for shapelets, called IPS, for shapelet discovery for TSC. The main challenge is to utilize the instance profile (IP) to capture the characteristics of shapelets in a robust manner and then to discover high-quality shapelets efficiently. First, we use our IP to generate abundant shapelet candidates. We next efficiently prune candidates that do not align with the definition of shapelets using a novel distribution-aware bloom filter (DABF). Three utility functions are proposed to measure the shapelet candidates and DABF is used to efficiently compute the functions. We have conducted comprehensive experiments on IPS with 12 competitive state-of-the-art methods using UCR Archive datasets. The efficiency is on average 25 times faster than that of BSPCOVER (the current state-of-the-art method). The accuracy of IPS is comparable to or higher than that of existing work. Furthermore, we select one case study to illustrate the interpretability of the shapelets.
Loading