The KU-ISPL entry to the GENEA Challenge 2023-A Diffusion Model for Co-speech Gesture generation

Published: 04 Sept 2023, Last Modified: 30 Oct 2023. GENEA Challenge 2023 Workshop proceeding. Readers: Everyone
Keywords: GENEA Challenge, co-speech gesture generation, diffusion, neural networks, generative models
Abstract: This paper describes a diffusion model for co-speech gesture generation, submitted as the KU-ISPL entry to the GENEA Challenge 2023. We divide gesture generation into two problems, co-speech gesture generation and semantic gesture generation, and focus on the former, solving it with a denoising diffusion probabilistic model conditioned on text, audio, and pre-pose. We use a U-Net with cross-attention as the denoising model, and we propose a gesture autoencoder as a mapping function from the gesture domain to the latent domain. The collective evaluation released by the GENEA Challenge 2023 shows that our model successfully generates co-speech gestures. Our system receives a mean human-likeness score of 32.0, a preference-matched appropriateness score of 53.6% for the main agent's speech, and an interlocutor speech appropriateness score of 53.5%. We also conduct an ablation study to measure the effect of the pre-pose condition. These results indicate that our system contributes to co-speech gesture generation for natural interaction.
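
For readers unfamiliar with denoising diffusion probabilistic models, the generic forward (noising) process and the noise-prediction training loss can be sketched as below. This is a minimal illustration of the standard DDPM formulation, not the authors' implementation: the linear noise schedule, its endpoint values, the function names, and the toy latent vector are all assumptions, and the text, audio, and pre-pose conditioning and the U-Net denoiser are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoint values are assumptions)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bars[t])
    scale_noise = math.sqrt(1.0 - alpha_bars[t])
    x_t = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return x_t, eps


def noise_prediction_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)


# Hypothetical latent gesture vector standing in for an autoencoder output.
rng = random.Random(0)
latent = [0.5, -1.0, 0.25, 2.0]
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
noisy_latent, true_noise = q_sample(latent, t=10, alpha_bars=alpha_bars, rng=rng)
```

In a conditional setup such as the one the abstract describes, a denoising network would take `noisy_latent`, the timestep, and the conditioning signals as input and be trained to minimize `noise_prediction_loss` against `true_noise`.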