Adaptive Human–AI Coordination via Hierarchical Action Disentanglement

TMLR Paper8973 Authors

16 May 2026 (modified: 26 May 2026) | Under review for TMLR | CC BY 4.0

Abstract: Human–AI collaboration requires intelligent agents that can rapidly adapt their strategies to diverse partner styles and skill levels, while remaining capable of coordinating with previously unseen partners. Existing deep hierarchical reinforcement learning (DHRL) approaches often collapse to a single behavior or produce diverse behaviors that do not align with partner dynamics, leading to suboptimal coordination. To address these challenges, we introduce Intrinsic Action Disentanglement (IAD), a DHRL-based approach that trains agents to discover distinct low-level action sequences corresponding to different partner behaviors. IAD achieves this through a novel intrinsic reward that encourages the low-level policy to produce disentangled action distributions conditioned on high-level latent skills. This design ensures that each high-level skill is mapped to a distinct, partner-aware response, enabling agents to flexibly adapt to partners with varying skill levels and coordination styles while maintaining robust coordination with previously unseen partners. We evaluate IAD extensively in the collaborative Overcooked-AI environment across multiple layouts, each presenting unique coordination challenges. Agents are tested with large, unseen populations of partners characterized by varying skill levels and behavioral styles, as well as a human-proxy model trained from human–human gameplay data. Analyses of skill usage reveal that IAD effectively utilizes its full set of skills, dynamically switching between them to adapt to diverse partner behaviors. Across all settings, IAD consistently outperforms baseline methods, achieving higher returns and robust coordination.

Submission Type: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Oleg_Arenz1

Submission Number: 8973
