Incorporating Spatial Awareness in Data-Driven Gesture Generation for Virtual Agents

Published: 01 Jan 2024, Last Modified: 12 Apr 2025, Venue: CoRR 2024, License: CC BY-SA 4.0
Abstract: This paper focuses on enhancing human-agent communication by integrating spatial context into virtual agents' non-verbal behaviors, specifically gestures. Recent advances in co-speech gesture generation have primarily utilized data-driven methods, which create natural motion but limit the scope of gestures to those performed in a void. Our work aims to extend these methods by enabling generative models to incorporate scene information into speech-driven gesture synthesis. We introduce a novel synthetic gesture dataset tailored for this purpose. This development represents a critical step toward creating embodied conversational agents that interact more naturally with their environment and users.