Decision Support Model for Time Series Data Augmentation Method Selection

Published: 01 Jan 2024, Last Modified: 02 Aug 2025. Venue: IEEE Access 2024. License: CC BY-SA 4.0
Abstract: Data augmentation (DA) plays a crucial role in machine learning by improving model generalization and mitigating data scarcity, which is particularly prevalent in domains with limited access to sensitive information or with rare events. Despite the availability of various DA techniques for imbalanced time-series classification (ITSC) problems, comprehensive guidelines for selecting the most appropriate technique based on input data features and the chosen classifier are lacking. This paper empirically demonstrates the limitations of conventional data balancing practices through experiments on 720 ITSC datasets, using 7 classifier architectures and 6 DA techniques (TimeGAN, SMOTE, ADASYN, Random Oversampling, Jittering, Time Warping). Our study explores the relationship between DA techniques and the inherent characteristics of ITSC datasets and classifiers, and introduces a novel ML-based decision support system, BALANCER (imBALanced AugmeNtation reCommendER), trained on this empirical data to give ML practitioners an automated way to select the most appropriate DA method for their specific application. Alongside each recommendation, BALANCER predicts the performance improvement expected from balancing the data with the recommended method. Evaluation against traditional mean rank recommendations shows significant improvements: BALANCER achieves an average Kendall’s tau of 0.36 (compared to −0.01 for mean rank recommendations) and a root mean square error of $1.5\times 10^{-2}$ on individual predictions. The reasons behind this notable disparity are analyzed using eXplainable AI (XAI), which shows that BALANCER can uncover deeper and more complex feature interactions than a mean rank-style recommendation strategy.
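To make the two reported metrics concrete, the following is a minimal sketch of how a recommender's output for one dataset could be scored: Kendall's tau compares the predicted ordering of the six DA methods with the observed ordering, and the root mean square error compares the predicted and observed performance gains. The abstract does not say which tau variant or performance measure was used, so the tau-a form (no tie correction) is an assumption here, and every number below is hypothetical and not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def kendall_tau(a, b):
    """Kendall's tau-a between two equal-length score lists (no tie correction)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have the same length, at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # A pair is concordant when both lists order items i and j the same way.
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))


# The six DA techniques named in the abstract.
methods = ["TimeGAN", "SMOTE", "ADASYN", "Random Oversampling",
           "Jittering", "Time Warping"]

# Hypothetical performance gains for a single dataset/classifier pair.
predicted_gain = [0.030, 0.050, 0.040, 0.010, 0.020, 0.000]
observed_gain = [0.020, 0.060, 0.030, 0.015, 0.010, 0.005]

tau = kendall_tau(predicted_gain, observed_gain)
error = rmse(predicted_gain, observed_gain)
recommended = methods[predicted_gain.index(max(predicted_gain))]

print(f"recommended method: {recommended}")
print(f"Kendall's tau: {tau:.3f}, RMSE: {error:.4f}")
```

A tau near 1 means the predicted ranking of methods closely matches the observed one, a tau near 0 means the ranking carries essentially no information, and a negative tau means the orderings tend to disagree. Averaging such per-dataset scores over a test set would give figures comparable in kind to the 0.36 and −0.01 quoted in the abstract.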