Keywords: continual reinforcement learning, benchmark design, task sequences, scene-level tasks, human learning data, plasticity and stability, gameplay patterns
TL;DR: We present GHAIA, a benchmark framework for continual reinforcement learning that aligns human and artificial learning trajectories through structured video game tasks, semantic annotations, and extensive human data.
Abstract: We propose a design for a continual reinforcement learning (CRL) benchmark called GHAIA, centered on human-AI alignment of learning trajectories in structured video game environments. Using \textit{Super Mario Bros.} as a case study, we decompose gameplay into short, annotated scenes and organize them into diverse task sequences based on gameplay patterns and difficulty. Evaluation protocols measure both plasticity and stability, with flexible revisit and pacing schedules. A key innovation is the inclusion of high-resolution human gameplay data collected under controlled conditions, enabling direct comparison of human and agent learning. In addition to adapting classical CRL metrics such as forgetting and backward transfer, we introduce semantic transfer metrics that capture learning over groups of scenes sharing similar game patterns. We demonstrate the feasibility of our approach on human and agent data, and discuss key aspects of the first release for community input.
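For readers unfamiliar with the classical CRL metrics named in the abstract, the following is a minimal sketch of backward transfer and average forgetting in their commonly used textbook forms. It is not GHAIA's adapted version, which the abstract does not specify. The performance matrix `R`, the function names, and the example scores are all hypothetical; `R[i, j]` is assumed to hold the score on task j measured after training through task i, and the forgetting peak is assumed to be taken only over stages at or after the one in which task j was trained.

```python
import numpy as np


def backward_transfer(R):
    """Mean change on earlier tasks between when each was learned and the end.

    R is a T x T matrix (assumed layout): R[i, j] is the score on task j
    after training through task i. Negative values indicate forgetting.
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    if R.shape != (T, T) or T < 2:
        raise ValueError("R must be a square matrix with at least 2 tasks")
    final_scores = R[T - 1, : T - 1]        # scores on tasks 0..T-2 at the end
    scores_when_learned = np.diag(R)[: T - 1]  # scores right after each task
    return float(np.mean(final_scores - scores_when_learned))


def average_forgetting(R):
    """Mean drop from each earlier task's best past score to its final score.

    Assumption: the best past score for task j is taken over stages
    j..T-2, i.e. from when task j was trained up to the stage before last.
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    if R.shape != (T, T) or T < 2:
        raise ValueError("R must be a square matrix with at least 2 tasks")
    drops = [R[j : T - 1, j].max() - R[T - 1, j] for j in range(T - 1)]
    return float(np.mean(drops))


# Hypothetical scores for a three-task sequence (illustrative only).
example_R = [
    [0.90, 0.10, 0.10],
    [0.70, 0.80, 0.20],
    [0.60, 0.75, 0.85],
]
bwt = backward_transfer(example_R)
forgetting = average_forgetting(example_R)
```

In this made-up example the two quantities mirror each other because scores on earlier tasks only decline; they diverge when a task's score improves after its own training stage, which is one reason benchmarks often report both.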
Submission Number: 16