Using Large Language Models for Robot-Assisted Therapeutic Role-Play: Factuality is not enough!

Sviatlana Höhn, Jauwairia Nasir, Ali Paikan, Pouyan Ziafati, Elisabeth André

Published: 2024, Last Modified: 21 Oct 2024, CUI 2024, CC BY-SA 4.0

Abstract: Robot-assisted social role-play can help neurodivergent individuals practice social skills in a safe environment. Large language models (LLMs) facilitate the implementation of such agents. However, high quality standards must be ensured in this sensitive setting. This article argues that current methods for evaluating generated language are not sufficient because they are grounded in beliefs about language as an external code to describe the world (referential functions of language). We argue that non-referential functions of language must be part of the evaluation of LLM-generated language when LLMs engage in social interactions with users. We test the feasibility of our approach in a pilot implementation of a platform for robot-assisted social role-play. Our proposed evaluation framework helps to systematically assess referential and non-referential functions of LLM-generated language. We argue that the evaluation framework can also be applied to multimodal interaction.