Where Do We Look When We Teach? Analyzing Human Gaze Behavior Across Demonstration Devices in Robot Imitation Learning

Published: 17 Sept 2025 · Last Modified: 17 Sept 2025 · H2R CoRL 2025 Workshop · CC BY 4.0
Keywords: Gaze Behavior, Demonstration Devices, Imitation Learning
TL;DR: Demonstration devices that emulate a robot impair the demonstrator's gaze-based extraction of task-relevant cues, and gaze behavior collected with devices that capture natural human behavior can improve policy robustness.
Abstract: Imitation learning for acquiring generalizable policies often requires a large volume of demonstration data, making the process costly. One promising strategy for addressing this challenge is to leverage the cognitive and decision-making skills of human demonstrators, who generalize strongly, in particular by extracting task-relevant cues from their gaze behavior. However, in imitation learning humans typically collect data using demonstration devices that emulate a robot's embodiment and visual conditions. This raises the question of how such devices influence gaze behavior. We propose an experimental framework that systematically analyzes demonstrators' gaze behavior across a spectrum of demonstration devices. Our results indicate that devices emulating (1) a robot's embodiment or (2) its visual conditions impair demonstrators' ability to extract task-relevant cues via gaze, with the extent of impairment depending on the degree of emulation. Additionally, proof-of-concept experiments show that gaze data collected with devices that capture natural human behavior improves the task success rate of imitation learning policies from 18.8% to 68.8% under environmental shifts.
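The abstract does not specify how the collected gaze data enters the policy. As one illustrative possibility only (not the paper's published method), the minimal sketch below shows gaze-centered cropping, a common way a gaze signal is used in visuomotor imitation learning: the observation is cropped around the demonstrator's fixation point so the policy network receives a "foveated" view emphasizing task-relevant regions. The function name `gaze_crop` and the parameter `crop_size` are hypothetical.

```python
# Hypothetical sketch, not the paper's method: use the demonstrator's gaze
# point to extract a task-relevant patch from each demonstration frame.
import numpy as np

def gaze_crop(image: np.ndarray, gaze_xy: tuple, crop_size: int = 64) -> np.ndarray:
    """Crop a square patch centered on the gaze fixation (pixel coordinates).

    image:   H x W x C array from the demonstration camera.
    gaze_xy: (x, y) gaze point in pixel coordinates.
    """
    h, w = image.shape[:2]
    half = crop_size // 2
    # Clamp the window center so the crop stays inside the image bounds.
    x = int(np.clip(gaze_xy[0], half, w - half))
    y = int(np.clip(gaze_xy[1], half, h - half))
    return image[y - half:y + half, x - half:x + half]

# Example: a policy could consume both the full frame and the gaze patch.
frame = np.zeros((240, 320, 3), dtype=np.uint8)  # dummy camera frame
patch = gaze_crop(frame, gaze_xy=(150, 100))     # 64 x 64 gaze-centered crop
assert patch.shape == (64, 64, 3)
```

Under this assumption, the patch would be concatenated with (or attended over alongside) the full frame as policy input, so that gaze recorded under natural viewing conditions biases the policy toward the cues the demonstrator actually attended to.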
Submission Number: 9