Keywords: Gesture Generation, Audio-driven Pose Estimation
TL;DR: Given speech audio and text transcriptions, GestureMaster can automatically generate a high-quality gesture sequence.
Abstract: This paper describes the GestureMaster entry to the GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied
Agents) Challenge 2022. Given speech audio and text transcriptions, GestureMaster can automatically generate a high-quality gesture
sequence to accompany the input audio and text transcriptions in terms of style and rhythm. The GestureMaster system is based on the
recent ChoreoMaster publication. ChoreoMaster generates dance motion from a piece of music. We adapt ChoreoMaster
to the speech-driven gesture generation task. We are pleased to see that among the participating systems,
our entry attained the highest median score in the human-likeness evaluation. In the appropriateness evaluation, we ranked first in
the upper-body study and second in the full-body study.