Spatio-Temporal Data Augmentation Method for Network Traffic Prediction

Sung Oh, Myeong-Jun Oh, Jong-Kyung Im, Ji-Yeon Park, Joung-Sik Kim, Na-Rae Yi, Myung-Ho Kim, Sung-Ho Bae

Published: 01 Jan 2025, Last Modified: 27 Feb 2026, Venue: IEEE Access, License: CC BY-SA 4.0
Abstract: Enhancing the generalization performance of CNN-based deep learning models for network traffic prediction requires a substantial amount of data. However, collecting large-scale network traffic data is costly, which makes data augmentation a practical way to improve model performance without additional data acquisition costs. Network traffic data has both spatial and temporal characteristics, so an augmentation method should account for both. Despite this need, existing studies have largely overlooked augmentation techniques that address spatial and temporal features simultaneously. Moreover, network traffic data often exhibits localized, fine-grained patterns, so augmented data that deviates significantly in space from the original distribution can undermine structural consistency and severely degrade prediction performance. To address these challenges, we propose a novel spatio-temporal data augmentation framework based on a UNet architecture. The proposed method treats the UNet as an inpainting model that generates augmented data by masking and reconstructing partial regions of input samples. A spatial attention module is embedded within the UNet structure to better capture localized features, while the temporal aspect is modeled by providing a sequence of n consecutive time steps as conditional input. This enables the model to generate spatially and temporally coherent augmented samples. With this inpainting-based approach, we generate subtle augmentations that maintain structural consistency and temporal context, ultimately leading to improved prediction performance. All experiments were conducted on the Telecom Italia Milano dataset, a widely used real-world network traffic dataset. Experimental results demonstrate that the proposed method consistently outperforms baseline models and conventional augmentation techniques, improving MSE by 15.07% for SMS, 32.7% for CALL, and 10.15% for INTERNET traffic prediction.
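The mask-and-reconstruct step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the trained UNet with spatial attention is replaced here by a hypothetical placeholder (`placeholder_inpainter`) that simply averages the n conditioning frames, and the square mask shape, zero fill, grid size, and all function names are assumptions made for illustration.

```python
import numpy as np


def random_block_mask(height, width, block, rng):
    """Return a boolean mask that is True inside one random block x block square."""
    top = rng.integers(0, height - block + 1)
    left = rng.integers(0, width - block + 1)
    mask = np.zeros((height, width), dtype=bool)
    mask[top:top + block, left:left + block] = True
    return mask


def placeholder_inpainter(masked_frame, mask, context):
    """Hypothetical stand-in for the trained UNet (assumption, not the paper's model).

    It ignores the masked frame and predicts every cell as the mean of the
    n conditioning frames, so only the temporal context drives the fill.
    """
    return context.mean(axis=0)


def augment(frames, t, n, block, rng, inpaint=placeholder_inpainter):
    """Mask part of frame t and reconstruct it from the n preceding frames."""
    if t < n:
        raise ValueError("need at least n frames before index t")
    target = frames[t]
    context = frames[t - n:t]                # n consecutive time steps
    mask = random_block_mask(*target.shape, block, rng)
    masked = np.where(mask, 0.0, target)     # hide the selected region
    prediction = inpaint(masked, mask, context)
    # Keep original values outside the mask; use the reconstruction inside it.
    augmented = np.where(mask, prediction, target)
    return augmented, mask


# Synthetic traffic volumes on a small grid (shape: time, height, width).
rng = np.random.default_rng(0)
frames = rng.random((10, 16, 16))
t, n, block = 6, 3, 4
augmented, mask = augment(frames, t=t, n=n, block=block, rng=rng)
```

Because only the masked region is replaced, the augmented sample stays identical to the original everywhere else, which is one way to read the abstract's emphasis on subtle augmentations that preserve structural consistency. In practice the `inpaint` argument would be a trained model rather than a temporal average.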