Can You See How I Learn? Human Observers' Inferences about Reinforcement Learning Agents' Learning Processes

Published: 01 Apr 2025, Last Modified: 02 May 2025 · ALA · CC BY 4.0
Keywords: Human-in-the-loop RL, Human-Agent Interaction, Human-Robot Interaction, Hybrid Intelligence
TL;DR: This work systematically examines and taxonomizes how human observers perceive and interpret RL agents' learning processes
Abstract: Reinforcement Learning (RL) agents often exhibit learning behaviors that are not intuitively interpretable by human observers, which can result in suboptimal feedback in collaborative teaching settings. Yet, how humans perceive and interpret RL agents' learning behavior is largely unknown. Using a bottom-up approach with two experiments, this work provides a data-driven account of the factors shaping human observers' understanding of an agent's learning process. We developed a novel, observation-based paradigm to directly assess human inferences about agent learning. In an exploratory interview study (N=9), we identified four core themes in human interpretations: Agent Goals, Knowledge, Decision Making, and Learning Mechanisms. A second, confirmatory study (N=34) applied an expanded version of the paradigm across two tasks (navigation and manipulation) and two RL algorithms (tabular and function approximation). Analyses of 816 responses confirmed the reliability of the paradigm and refined the thematic framework, revealing how these themes evolve over time and interrelate. Our findings provide a human-centered understanding of how people make sense of agent learning, offering actionable insights for designing interpretable RL systems and improving transparency in Human-Robot Interaction.
Type Of Paper: Full paper (max 8 pages)
Anonymous Submission: Anonymized submission.
Submission Number: 22