Sparse Self-Attention Guided Generative Adversarial Networks for Time-Series Generation

Published: 01 Jan 2023 · Last Modified: 10 Sept 2024 · DSAA 2023 · CC BY-SA 4.0
Abstract: Remarkable progress has been achieved in generative modeling for time-series data since the introduction of Generative Adversarial Networks (GANs) [1]. A GAN generates synthetic data instances using two neural networks, a generator and a discriminator, trained simultaneously in opposition to each other [1]. The generator learns to produce fake data that the discriminator will classify as authentic, while the discriminator attempts to distinguish real data from generated data; at convergence, the generator can produce realistic data. GANs have made remarkable progress in various tasks, such as the generation of time series [4], images [5], and videos [3]. In particular, a significant body of work has applied GANs built on Recurrent Neural Networks (RNNs) to time-series generation [4]. However, careful examination of the samples produced by these models shows that RNN-based GANs, such as LSTM GANs and gated recurrent GANs, cannot handle long sequences. Although RNN-based GANs can generate many realistic samples, training remains difficult due to exploding and vanishing gradients and mode collapse, which limits their generation capability. In addition, these RNN-based GANs are typically designed for regularly sampled time series, and thus cannot properly capture informative, varying intervals, a major concern for time-series generation.

In this paper, we propose SparseGAN, a novel sparse self-attention-based GAN that enables attention-driven, long-memory modeling for regular and irregular time-series generation through a learned embedding space. In this way, it can yield a more informative representation and capture long-range dependencies for time-series generation while using the original data for supervision. SparseGAN comprises two essential sub-networks, the Supervision Network and the Generation Network, which collaborate in an end-to-end manner to generate realistic time-series data.

The Supervision Network is an encoder-decoder network that employs Gated Recurrent Unit (GRU) cells for feature extraction. Its encoder maps the input time series to a latent feature representation, and its decoder reconstructs the time series from that representation. This reconstruction is crucial for facilitating the generation task and reducing data complexity: using reconstructed data, instead of the raw data, helps the Generation Network learn the underlying data dynamics more effectively. The Supervision Network is trained to minimize a reconstruction loss defined as the expected difference between the input data and the network's output.

The Generation Network consists of a generator and a discriminator following the standard GAN architecture. The generator takes random noise as input and transforms it into realistic time series, while the discriminator aims to distinguish between real and generated time series. Both are trained with an adversarial loss that measures how well the discriminator can separate real data from generated data.

To handle irregular and sparse time-series data, SparseGAN introduces a Sparse Self-Attention Module. Traditional self-attention mechanisms compute dense dependencies between all pairs of time steps, which is inefficient and cannot assign exactly zero probability to insignificant relationships. Sparse Self-Attention instead applies an $\alpha$-entmax transformation [2] in place of the standard softmax, yielding sparser probability distributions and reducing computational complexity. The Sparse Self-Attention Module is incorporated into both the generator and the discriminator: the generator employs a stack of sparse self-attention layers followed by a fully connected feed-forward network, and the discriminator likewise uses self-attention layers with the $\alpha$-entmax transformation. This modification improves time-series generation by more accurately modeling the distribution of the real data.

The Supervision Network and the Generation Network are jointly trained end-to-end, which ensures that the generated data aligns with the real data distribution. To further enhance the quality of the generated data, the Supervision Network's output, C, supervises the generator through a supervision loss, $L_{supervise}$, which measures the difference between the reconstructed data and the generated data. The final objective, $L^{*}$, combines the reconstruction loss, the adversarial loss, and the supervision loss, with a hyperparameter $\lambda_{s}$ controlling the contribution of the supervision loss.
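Schematically, this combination can be written as a weighted sum of the three terms (a hedged reconstruction based on the description above; the paper defines the exact form of each loss):

$$L^{*} = L_{reconstruction} + L_{adversarial} + \lambda_{s}\, L_{supervise},$$

where $\lambda_{s}$ trades off the supervision term against the reconstruction and adversarial losses.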
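To make the sub-networks concrete, the following is a minimal PyTorch sketch of a GRU encoder-decoder in the spirit of the Supervision Network; the module names, layer sizes, and the mean-squared reconstruction loss are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SupervisionNetwork(nn.Module):
    def __init__(self, feature_dim: int, hidden_dim: int):
        super().__init__()
        # Encoder GRU: maps the input series to a latent feature sequence.
        self.encoder = nn.GRU(feature_dim, hidden_dim, batch_first=True)
        # Decoder GRU + linear head: reconstructs the series from the latents.
        self.decoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, feature_dim)

    def forward(self, x):                       # x: (batch, seq_len, feature_dim)
        latent, _ = self.encoder(x)             # latent feature representation
        decoded, _ = self.decoder(latent)
        return self.out(decoded)                # reconstruction of x

def reconstruction_loss(net: SupervisionNetwork, x: torch.Tensor) -> torch.Tensor:
    # Expected difference between input and output (MSE as one concrete choice).
    return torch.mean((net(x) - x) ** 2)
```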
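Likewise, a minimal sketch of a self-attention layer in which the softmax over attention scores is replaced by 1.5-entmax, using the entmax15 function from the open-source entmax package that implements [2]; the single-head projection layout and names are again illustrative assumptions.

```python
import torch
import torch.nn as nn
from entmax import entmax15  # pip install entmax

class SparseSelfAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):                       # x: (batch, seq_len, dim)
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = torch.bmm(q, k.transpose(1, 2)) * self.scale
        # 1.5-entmax produces a sparse distribution over time steps: weakly
        # related positions receive exactly zero weight, unlike softmax,
        # whose output is always dense.
        weights = entmax15(scores, dim=-1)
        return torch.bmm(weights, v)
```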
For evaluation, we assess the effectiveness of the proposed model on synthetic and real-world datasets, conducting a series of experiments that target the key challenges of time-series generation. First, we use a diverse range of time-series data, both regular and irregular: synthetic sine waves, daily Google stock data, energy consumption data, power consumption data with irregular sampling, and air quality data with irregular sampling. To benchmark SparseGAN's performance, we compare it against TimeGAN [4] and several other baseline models in terms of generation quality and diversity.

The evaluation covers two key aspects: fidelity and diversity. Fidelity is assessed by augmenting real-world datasets with SparseGAN-generated data and measuring the resulting improvement in time-series forecasting accuracy, especially in low-data scenarios; the results consistently show that SparseGAN improves forecasting accuracy significantly more than competing models across the datasets. Diversity is evaluated by examining how well the generated data preserves the diversity and patterns of the original data; the findings indicate that SparseGAN-generated data closely resembles the characteristics of real-world time series and is virtually indistinguishable from actual data. Furthermore, we conduct a sensitivity analysis to test the robustness of our findings, investigating the effect of different attention mechanisms; 1.5-entmax emerges as more beneficial to the model's performance than softmax.

In conclusion, SparseGAN is a promising generative model for time-series data. It effectively captures long-term dependencies, maintains the characteristics of the data distribution, and outperforms existing models in terms of data quality and diversity. Its ability to improve time-series forecasting accuracy, particularly in low-data scenarios, underscores its potential for a range of practical applications.