Human-centered Evaluation of Generative Models for Emotional 3D Animation Generation in VR

ICCV 2025 Workshop CV4A11y Submission 17 Authors

06 Aug 2025 (modified: 28 Aug 2025) · Submitted to CV4A11y · CC BY 4.0
Keywords: Generative Models, 3D Emotional Animation, Human-centered Evaluation, Virtual Reality, Nonverbal Communication
TL;DR: We evaluate emotional 3D animations from generative models in VR using perceptual metrics and show the need for emotion-aware modeling and human-centric feedback.
Abstract: Facial expressions and body gestures are vital for conveying emotion in social interaction. While generative models can produce speech-synchronized 3D animations, traditional 2D evaluations often miss user-perceived emotional quality. We present a VR-based user study (N=48) evaluating three state-of-the-art speech-driven 3D animation models across two emotions—happiness (high arousal) and neutral (mid arousal)—using user-centric metrics: arousal realism, naturalness, enjoyment, diversity, and interaction quality. We also compare against real human expressions obtained via a reconstruction-based method. Models that explicitly encode emotion achieved higher emotion recognition rates than those driven solely by speech. Happy animations were rated significantly more realistic and natural than neutral ones, highlighting the difficulty of modeling subtle emotion. Generative models underperformed reconstructions in facial expression quality, while all conditions received comparable ratings for enjoyment and interaction quality. Users reliably recognized differences in gesture diversity across generative outputs, motivating the integration of perceptual feedback into animation models.
Submission Number: 17