AssistanceZero: Scalably Solving Assistance Games

Published: 01 May 2025, Last Modified: 18 Jun 2025. Venue: ICML 2025 poster. License: CC BY 4.0
TL;DR: Using assistance games, an alternative to reinforcement learning from human feedback (RLHF), we develop an AI assistant that helps users build houses in Minecraft.
Abstract: Assistance games are a promising alternative to reinforcement learning from human feedback (RLHF) for training AI assistants. Assistance games resolve key drawbacks of RLHF, such as incentives for deceptive behavior, by explicitly modeling the interaction between assistant and user as a two-player game where the assistant cannot observe their shared goal. Despite their potential, assistance games have only been explored in simple settings. Scaling them to more complex environments is difficult because it requires both solving intractable decision-making problems under uncertainty and accurately modeling human users' behavior. We present the first scalable approach to solving assistance games and apply it to a new, challenging Minecraft-based assistance game with over $10^{400}$ possible goals. Our approach, AssistanceZero, extends AlphaZero with a neural network that predicts human actions and rewards, enabling it to plan under uncertainty. We show that AssistanceZero outperforms model-free RL algorithms and imitation learning in the Minecraft-based assistance game. In a human study, our AssistanceZero-trained assistant significantly reduces the number of actions participants take to complete building tasks in Minecraft. Our results suggest that assistance games are a tractable framework for training effective AI assistants in complex environments. Code and videos are available at https://anonymous.4open.science/w/scalably-solving-assistance-games/.
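The core idea of an assistance game, an assistant that cannot observe the goal and must infer it from the user's behavior, can be illustrated with a toy example. The sketch below is not AssistanceZero and does not use the paper's environment or networks: it assumes a hypothetical one-dimensional world with two candidate goals and a hypothetical noisily rational human model (`human_action_probs`), and shows only the Bayesian belief update over goals. All names and parameters here are illustrative assumptions.

```python
import math

def human_action_probs(position, goal, beta=2.0):
    """Hypothetical noisily rational human (an assumption, not the paper's model).

    The human prefers moves that end closer to the goal; beta controls
    how reliably they pick the best move.
    """
    actions = (-1, 0, 1)  # step left, stay, step right
    scores = [-beta * abs((position + a) - goal) for a in actions]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {a: w / total for a, w in zip(actions, weights)}

def update_belief(belief, position, action):
    """Bayes rule: posterior(goal) is proportional to prior(goal) * P(action | goal)."""
    unnormalized = {
        goal: prior * human_action_probs(position, goal)[action]
        for goal, prior in belief.items()
    }
    z = sum(unnormalized.values())
    return {goal: p / z for goal, p in unnormalized.items()}

# The assistant starts with a uniform belief over two candidate goals
# (target positions 0 and 4) and watches the human step right twice.
belief = {0: 0.5, 4: 0.5}
position = 2
for action in (1, 1):
    belief = update_belief(belief, position, action)
    position += action

print(belief)  # most of the probability mass should now be on goal 4
```

An assistant that plans against such a belief, rather than against a single assumed goal, has an incentive to act cautiously while the goal is still uncertain. The abstract describes AssistanceZero as doing this at much larger scale, using a neural network to predict human actions and rewards inside AlphaZero-style planning.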
Lay Summary: We built an AI assistant that plays the game Minecraft with you: if you start building a house in Minecraft, it figures out what you’re doing and jumps in to help. AI assistants like ChatGPT are trained via a technique called RLHF; our assistant is instead trained using an approach called "assistance games," in which it helps a simulated user build many different houses and improves itself via trial and error. This approach encourages the assistant to communicate carefully with the user and to avoid assuming that it knows what their goal is. We compare our assistant developed using assistance games to one developed using techniques similar to those behind ChatGPT and other AI chatbots. We find our assistant is much better at helping real people in Minecraft, allowing them to build houses with less effort.
Link To Code: https://github.com/cassidylaidlaw/minecraft-building-assistance-game
Primary Area: Reinforcement Learning->Multi-agent
Keywords: assistance games, assistants, alignment, multi-agent, interaction, cooperation, coordination, goal inference
Flagged For Ethics Review: true
Submission Number: 7171