A Methodology for Evaluating Multimodal Referring Expression Generation for Embodied Virtual Agents

Nada Alalyani; Nikhil Krishnaswamy

Published: 11 Aug 2023; Last Modified: 30 Oct 2023; Venue: GENEA Workshop 2023; Readers: Everyone

Keywords: Embodied agents, non-verbal behaviours, multimodality, referring expression generation

Abstract: Robust use of definite descriptions in a situated space often involves recourse to both verbal and non-verbal modalities. For intelligent virtual agents (IVAs), which are designed to interact with humans, the ability to both recognize and generate verbal and non-verbal behavior is a critical capability. To assess how well an IVA can deploy multimodal behaviors, including language, gesture, and facial expressions, we propose a methodology for evaluating the agent's capacity to generate object references in a situated context, using multimodal referring expressions as a use case. Our contributions include: 1) developing an embodied platform to collect the referring expressions humans produce while communicating with the IVA; 2) comparing human and machine-generated references in terms of evaluable properties, using subjective and objective metrics; and 3) reporting preliminary results from trials that examined whether the agent can retrieve and disambiguate the object the human referred to, whether the human can correct misunderstandings using language, deictic gesture, or both, and how easy humans find it to interact with the agent.

Paper Type: Long
