H2IL-MBOM: A Hierarchical World Model Integrating Intent and Latent Strategy as Opponent Modeling in Multi-UAV Game
Keywords: Multi-UAV Game, Opponent Modeling, World Model, Multi-Agent Reinforcement Learning
Abstract: In mixed cooperative-competitive scenarios, the uncertain decisions of agents on both sides not only render learning non-stationary but also pose a threat to each other's security. Existing methods either predict policy beliefs from opponents' interactive actions, goals, and rewards, or predict trajectories and intents solely from local historical observations. However, such private information about opponents is unavailable, and these methods neglect the underlying dynamics of the environment and the relationships among intentions, latent strategies, actions, and trajectories for both sides. To address these challenges, we propose a Hierarchical Interactive Intent-Latent-Strategy-Aware World Model based Opponent Model (H2IL-MBOM) and the Mutual Self-Observed Adversary Reasoning PPO (MSOAR-PPO), which enable both parties to dynamically and interactively predict multiple intentions and latent strategies, along with their trajectories, based on self-observation. Concretely, the high-level world model fuses opponent-related observations with multiple learnable intention queries to anticipate the opponents' future intentions and trajectories. It then passes the anticipated intentions to the low-level world model, which infers how the opponents' latent strategies react and how they influence the trajectories of cooperative agents. We validate the effectiveness of the method and demonstrate its superior performance through comparisons with state-of-the-art model-free reinforcement learning and opponent modeling methods in more challenging settings involving multi-agent close-range air-combat environments with missiles.
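As a rough illustration of the idea of fusing opponent-related observations with multiple learnable intention queries, the sketch below lets a small set of query vectors attend over opponent observation features and then hands the resulting intent embeddings to a lower level by concatenation. It is not the authors' implementation: the function names (`cross_attend`, `make_queries`), the sizes, the single-head attention without learned projections, the untrained random queries, and the concatenation step are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, features):
    """Scaled dot-product cross-attention: each query attends over all features.

    Assumption: single head, no learned key/value projections.
    Returns (outputs, weights), one row per query.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for q in queries:
        w = softmax([dot(q, f) / scale for f in features])
        out = [sum(w_i * f[d] for w_i, f in zip(w, features)) for d in range(dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def make_queries(num_intents, dim, seed=0):
    """Hypothetical intention queries; random here, learned in a real model."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_intents)]


# Hypothetical sizes: 3 candidate intentions, 4-dimensional features.
NUM_INTENTS, DIM = 3, 4

# Made-up feature vectors for two observed opponents.
opponent_obs = [
    [1.0, 0.0, 0.5, -0.5],
    [0.0, 1.0, -0.5, 0.5],
]

intent_queries = make_queries(NUM_INTENTS, DIM)
intent_embeddings, attn = cross_attend(intent_queries, opponent_obs)

# Assumed hand-off to a lower level: concatenate the agent's own observation
# with the flattened intent embeddings.
own_obs = [0.2, -0.1, 0.0, 0.3]
low_level_input = own_obs + [x for emb in intent_embeddings for x in emb]
```

Each row of `attn` shows how strongly one intention query weights each opponent's features. In a trained model the queries and any projection matrices would be learned jointly with the rest of the world model.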
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Submission Number: 3710