Keywords: explainability, social signal processing, multimodality
Abstract: Enabling robots to display the reasoning behind their decisions requires them to detect when users need explanations. A crucial challenge is that explanation need often manifests implicitly: users exhibit behavioural signals indicating misalignment well before they explicitly request an explanation. Psychological studies show that in human interactions such needs are sensed through multimodal cues and addressed through the co-construction of explanations in real time. Building on this, we introduce an approach for the early detection of explanation needs in human-robot interaction (HRI). Our method recognises when an explanation is likely to become necessary, enabling robots to act proactively. We evaluate the approach on an existing HRI dataset using features describing facial expressions, body movement, and vocal behaviour, combined with time-series classification techniques. Our results show that different classes of learning algorithms (unsupervised anomaly-based methods and supervised classification models) offer complementary strengths for detecting explanation needs: unsupervised methods provide early warning signals when labels are unavailable, while supervised models offer stronger discrimination (AUROC 0.7) when annotated data is available. We discuss the implications of these findings for the development of explanation-capable robots and outline future directions for proactive explanations in HRI.
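As a rough illustration of the two model families contrasted in the abstract, the sketch below compares an unsupervised anomaly detector against a supervised classifier on windowed features, evaluating both with AUROC. It is a minimal sketch, not the paper's implementation: the data is synthetic, and IsolationForest / RandomForestClassifier are assumed stand-ins for whichever anomaly-based and supervised models the paper actually uses.

```python
# Minimal sketch (assumptions throughout, not the paper's pipeline):
# contrast an unsupervised anomaly detector with a supervised classifier
# for flagging per-window "explanation need" from multimodal features.

import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for per-window multimodal features
# (e.g., facial-expression, body-movement, and vocal statistics).
n_windows, n_features = 1000, 12
X = rng.normal(size=(n_windows, n_features))
y = rng.random(n_windows) < 0.15  # 1 = explanation need in this window
# Windows with explanation need show shifted behaviour (synthetic effect).
X[y == 1] += rng.normal(1.0, 0.5, size=(int(y.sum()), n_features))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Unsupervised: fit on features only; score deviation from typical behaviour.
iso = IsolationForest(random_state=0).fit(X_train)
anomaly_score = -iso.score_samples(X_test)  # higher = more anomalous
print("unsupervised AUROC:", roc_auc_score(y_test, anomaly_score))

# Supervised: uses labels, typically yielding stronger discrimination.
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("supervised AUROC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```

The design point mirrors the abstract's finding: the anomaly detector never sees labels, so it can serve as an early-warning signal when annotation is unavailable, while the label-trained classifier usually discriminates better once annotated data exists.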
Submission Number: 11