Towards the generation of synchronized and believable non-verbal facial behaviors of a talking virtual agent
Keywords: Non-verbal behavior, behavior generation, embodied conversational agent, neural networks, adversarial learning, encoder-decoder
Abstract: This paper introduces a new model that generates rhythmically relevant non-verbal facial behaviors for virtual agents while they speak. In terms of perceived synchronization with speech and believability, the model's output is comparable to behaviors extracted directly from the data and replayed on a virtual agent. Interestingly, we found that training the model on two different datasets, instead of one, did not necessarily improve its performance: the expressiveness of the people in the dataset and the shooting conditions are key factors. We also show that employing an adversarial model, in which fabricated examples are introduced during the training phase, increases the perceived synchronization with speech. Videos demonstrating the results, together with the code, are available at: https://github.com/aldelb/non_verbal_facial_animation.
Paper Type: Long
Self-nominate For A Reproducibility Award: Yes