Abstract: Optimizing AI performance in isolation is not sufficient for effective human-AI collaboration; AI systems must also account for human factors. Prior research shows that incorporating models of human behavior into AI design can improve collaborative performance. However, existing approaches often implicitly assume that human behavior remains fixed regardless of the AI agent’s actions. In practice, humans adapt their behavior based on their beliefs about the AI’s intentions, that is, what they believe the AI is trying to accomplish. In this work, we develop and evaluate collaborative AI agents that account for human beliefs about AI intentions when choosing their actions. We formulate human-AI collaboration as a goal-oriented multi-agent decision-making problem and develop a belief model by extending level-$k$ reasoning with data-driven models of human behavior. Building on this belief model, we first design explicable AI policies that generate behavior from which humans can more easily infer the AI’s intentions, providing a direct test of whether the model captures human belief formation. We then incorporate the belief model into the training of collaborative AI agents to improve coordination with human partners. Through simulations and extensive human-subject experiments, we show that our belief model better captures human inferences about AI intentions and can be used to generate more explicable AI behavior. More importantly, we show that collaborative AI agents trained with models of human beliefs significantly improve team performance in human-AI collaboration settings. These results highlight the value of modeling human beliefs about AI intentions as a design principle for collaborative AI.
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: N/A
Assigned Action Editor: ~Huazheng_Wang1
Submission Number: 8833