Abstract: In dyadic interactions, the ability to generate appropriate facial reactions is essential for conveying empathy and understanding. This paper introduces a novel framework that couples a diffusion model with a vector-quantized variational autoencoder (VQ-VAE) to synthesize contextually appropriate facial reactions. We evaluate our model on the IEEE FG REACT2024 dataset, where it outperforms the baseline methods. These results underscore the potential of our framework to improve the fidelity of digital human interactions, paving the way for more nuanced and emotionally intelligent systems.
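The abstract describes a two-stage design: a VQ-VAE that discretizes listener facial-reaction features into a latent codebook, and a diffusion model that generates in that latent space conditioned on the speaker's behaviour. The sketch below is a minimal illustration of that general pattern, not the authors' released implementation; all module names, dimensions, the 58-dimensional reaction features, and the simplified cosine noise schedule are assumptions made for the example.

```python
# Illustrative sketch only: a VQ-VAE over listener reaction features plus a
# conditional denoiser trained on its quantized latents. Shapes and dimensions
# are hypothetical and not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F


class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through estimator."""

    def __init__(self, num_codes=512, code_dim=128, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z_e):                      # z_e: (B, T, D)
        flat = z_e.reshape(-1, z_e.size(-1))     # (B*T, D)
        dist = torch.cdist(flat, self.codebook.weight)   # (B*T, K)
        idx = dist.argmin(dim=-1)
        z_q = self.codebook(idx).view_as(z_e)
        # Standard VQ-VAE codebook + commitment losses.
        loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()         # straight-through gradient
        return z_q, loss, idx


class ReactionVQVAE(nn.Module):
    """Encodes a listener reaction sequence into quantized latents and decodes it back."""

    def __init__(self, feat_dim=58, code_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, code_dim), nn.GELU(),
                                     nn.Linear(code_dim, code_dim))
        self.quantizer = VectorQuantizer(code_dim=code_dim)
        self.decoder = nn.Sequential(nn.Linear(code_dim, code_dim), nn.GELU(),
                                     nn.Linear(code_dim, feat_dim))

    def forward(self, reaction):                 # reaction: (B, T, feat_dim)
        z_q, vq_loss, _ = self.quantizer(self.encoder(reaction))
        return self.decoder(z_q), vq_loss


class LatentDenoiser(nn.Module):
    """Predicts the noise added to reaction latents, conditioned on speaker features."""

    def __init__(self, code_dim=128, cond_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(code_dim + cond_dim + 1, hidden), nn.GELU(),
                                 nn.Linear(hidden, code_dim))

    def forward(self, z_t, t, speaker_cond):     # z_t: (B, T, D), t: (B,)
        t_emb = t.float().view(-1, 1, 1).expand(-1, z_t.size(1), 1)
        return self.net(torch.cat([z_t, speaker_cond, t_emb], dim=-1))


def diffusion_training_step(vqvae, denoiser, reaction, speaker_cond, num_steps=1000):
    """One simplified DDPM-style training step on the quantized reaction latents."""
    with torch.no_grad():
        z_q, _, _ = vqvae.quantizer(vqvae.encoder(reaction))
    t = torch.randint(0, num_steps, (reaction.size(0),))
    alpha_bar = torch.cos(0.5 * torch.pi * t / num_steps).view(-1, 1, 1) ** 2
    noise = torch.randn_like(z_q)
    z_t = alpha_bar.sqrt() * z_q + (1 - alpha_bar).sqrt() * noise
    return F.mse_loss(denoiser(z_t, t, speaker_cond), noise)
```

Under these assumptions, inference would run the reverse diffusion process conditioned on the speaker's features and pass the resulting latents through the VQ-VAE decoder to obtain facial-reaction coefficients; the discrete codebook is what lets the diffusion prior generate diverse yet plausible reactions.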