Leveraging the Latent Diffusion Models for Offline Facial Multiple Appropriate Reactions GenerationDownload PDFOpen Website

Published: 01 Jan 2023, Last Modified: 05 Nov 2023ACM Multimedia 2023Readers: Everyone
Abstract: Offline Multiple Appropriate Facial Reaction Generation (OMAFRG) aims to predict the reaction of different listeners given a speaker, which is useful in the senario of human-computer interaction and social media analysis. In recent years, the Offline Facial Reactions Generation (OFRG) task has been explored in different ways. However, most studies only focus on the deterministic reaction of the listeners. The research of the non-deterministic (i.e. OMAFRG) always lacks of sufficient attention and the results are far from satisfactory. Compared with the deterministic OFRG tasks, the OMAFRG task is closer to the true circumstance but corresponds to higher difficulty for its requirement of modeling stochasticity and context. In this paper, we propose a new model named FRDiff to tackle this issue. Our model is developed based on the diffusion model architecture with some modification to enhance its ability of aggregating the context features. And the inherent property of stochasticity in diffusion model enables our model to generate multiple reactions. We conduct experiments on the datasets provided by the ACM Multimedia REACT2023 and obtain the second place on the board, which demonstrates the effectiveness of our method.
0 Replies

Loading