Multilevel Temporal-Spectral Fusion Network for Multivariate Time Series Classification

Published: 01 Jan 2024, Last Modified: 20 May 2025, IJCNN 2024, CC BY-SA 4.0
Abstract: Multivariate time series classification (MTSC) plays an important role in a wide variety of applications, including human activity recognition, acoustic scene classification, and electronic health. Most existing approaches exploit either the temporal or the spectral features of the input time series but neglect the essential correlation between these two types of features. To address this limitation, we propose a multilevel temporal-spectral fusion network (called MTSFNet) that can effectively fuse both temporal and spectral features. MTSFNet has two main steps: i) it extracts multilevel spectral signals from the input data using wavelet transform networks, and these signals are then encoded into embedding vectors by a reduction encoder; ii) it fuses the temporal and multilevel spectral features with a cross-attention mechanism, exploiting the correlation between them for classification. Experimental results on ten popular datasets from the UEA archive suggest that our method outperforms state-of-the-art methods, with an average accuracy improvement of 4.3%. In-depth ablation experiments show that multilevel wavelet transform networks effectively improve classification accuracy, and that the three-level wavelet transform achieves the highest average classification accuracy, reaching 76.8%. These results demonstrate the advantages of our multilevel feature extraction and temporal-spectral fusion. We anticipate that MTSFNet will greatly facilitate the analysis of time-series data in practice.
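To make the two steps described in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a fixed Haar wavelet in place of the learned wavelet transform networks, omits the reduction encoder and all learned projections, and uses plain scaled dot-product attention with temporal tokens as queries and spectral tokens as keys and values. All function names, shapes, and the random input are hypothetical choices made for illustration.

```python
import numpy as np


def haar_dwt_level(x):
    """One Haar DWT level along time. x has shape (channels, time), time even."""
    even, odd = x[:, 0::2], x[:, 1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass (approximation) band
    detail = (even - odd) / np.sqrt(2.0)  # high-pass (detail) band
    return approx, detail


def multilevel_haar(x, levels=3):
    """Return the detail band of each level, followed by the final approximation."""
    bands = []
    approx = x
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values):
    """Scaled dot-product attention. queries: (Tq, d), keys_values: (Tk, d)."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values, weights


# Hypothetical input: 4 channels, 64 time steps.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64))

# Step i (simplified): a three-level spectral decomposition.
bands = multilevel_haar(x, levels=3)

# Step ii (simplified): temporal tokens attend to the spectral tokens of all levels.
temporal_tokens = x.T                                           # (64, 4)
spectral_tokens = np.concatenate([b.T for b in bands], axis=0)  # (32 + 16 + 8 + 8, 4)
fused, weights = cross_attention(temporal_tokens, spectral_tokens)
```

In this sketch each time step is treated as one token whose features are the channel values. A trained model would instead project tokens into a learned embedding space before attention and feed the fused representation to a classification head.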