Human Activity Prediction Using Generative Adversarial Networks

Amar Shete, Aashita Gupta, Ajay Waghumbare, Upasna Singh, Triveni Dhamale, Kiran Napte

Published: 2024, Last Modified: 25 Jan 2026, ICCCNT 2024, CC BY-SA 4.0

Abstract: This paper introduces an approach to human activity prediction that integrates Temporal Convolutional Networks (TCN) with an encoder-decoder model inside a Generative Adversarial Network (GAN) framework. The proposed model uses spatio-temporal 3D convolutions to capture the intricate patterns and temporal dependencies present in video data, improving prediction precision and robustness. We evaluate the approach on the KTH dataset for action recognition, which simplifies data preprocessing and allows focused model development. The GAN-based model at the core of this work consists of a Generator and a Discriminator: the Generator produces realistic video frames from latent-space representations, while the Discriminator drives the adversarial training. With an encoder-decoder architecture augmented by TCN layers, the model captures both the spatial and the temporal information in video sequences. In extensive experiments on benchmark datasets such as KTH Action, the model achieves competitive performance: on Mean Squared Error (MSE) and Structural Similarity Index (SSIM), it is more accurate than existing models such as FutureGAN, fRNN, and MCNet.
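The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes each frame is a flat list of grayscale pixel values in [0, 1], and it computes SSIM over the whole frame as a single window, whereas common implementations average SSIM over local sliding windows. The function names `mse` and `global_ssim` and the constants 0.01 and 0.03 (the usual SSIM defaults) are assumptions made for illustration.

```python
from statistics import fmean


def mse(pred, target):
    """Mean squared error between two frames given as flat pixel lists."""
    if len(pred) != len(target):
        raise ValueError("frames must have the same number of pixels")
    return fmean((p - t) ** 2 for p, t in zip(pred, target))


def global_ssim(pred, target, data_range=1.0):
    """SSIM over the whole frame as one window (a simplification).

    Assumption: standard stabilising constants k1 = 0.01 and k2 = 0.03.
    """
    if len(pred) != len(target):
        raise ValueError("frames must have the same number of pixels")
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_t = fmean(pred), fmean(target)
    var_p = fmean((p - mu_p) ** 2 for p in pred)
    var_t = fmean((t - mu_t) ** 2 for t in target)
    cov = fmean((p - mu_p) * (t - mu_t) for p, t in zip(pred, target))
    numerator = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator


# Hypothetical three-pixel frames, used only to exercise the functions.
predicted = [0.0, 0.5, 1.0]
ground_truth = [0.0, 0.5, 0.0]
print(mse(predicted, ground_truth))
print(global_ssim(predicted, ground_truth))
```

Lower MSE and higher SSIM both indicate a predicted frame closer to the ground truth. A per-sequence score would typically average these values over all predicted frames.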

External IDs: dblp:conf/icccnt/SheteGWSDN24