Keywords: explainability, failure detection, multimodality
Domains: Robotics
External Link: https://ieeexplore.ieee.org/document/10974079
Abstract: As black-box AI systems become increasingly complex, understanding when and how to provide explanations to users is crucial. Multimodal signals, such as facial expressions, offer novel insights into how frequently explanations should be given. This paper explores whether users’ facial features can help estimate the need for explanations in a collaborative robot task. We applied three state-of-the-art eXplainable AI (XAI) methods, addressing how, why, and what-if questions, to explain the robot's failure detection model. Each explanation type conveyed information differently: how-explanations described how the model functions, why-explanations provided personalised insights into input-feature-related cues, and what-if-explanations explored alternative scenarios. In a mixed-design study (N=33), participants performed a robot-assisted pick-and-place task, receiving different explanation types. Our results show that users responded differently to these explanations, with why-explanations being the most preferred and prompting closer alignment in facial expressions with the robot. Contrary to expectations, what-if-explanations led to the least alignment and required greater vocal effort. These findings demonstrate how non-verbal cues can guide the frequency and type of explanations (personalised or general) and further highlight the importance of model transparency in human-robot collaboration.
Submission Number: 179