Keywords: World Model, Hierarchical Planning, Reinforcement Learning, Autonomous Driving, Communications
TL;DR: We propose a hierarchical world model framework that mimics human driving by planning and communicating higher-level driving actions. We devise AdaSMO to address the multi-objective optimization problem in hierarchical training.
Abstract: World models (WMs) are a highly promising approach for training AI agents. However, in complex learning systems such as autonomous driving, AI agents interact with others in a dynamic environment and face significant challenges such as partial observability and non-stationarity. Inspired by how humans naturally solve complex tasks hierarchically and how drivers share their intentions with turn signals, we introduce HANSOME, a WM-based framework for hierarchical planning with semantic communications. In HANSOME, semantic information, particularly text and compressed visual data, is generated and shared to improve two-level planning. HANSOME incorporates two key designs: 1) a hierarchical planning strategy, in which the higher-level policy generates intentions with text semantics and a semantic alignment technique ensures that the lower-level policy determines specific controls to achieve these intentions; 2) a cross-modal encoder-decoder that fuses the shared semantic information and uses it to enhance planning through multi-modal understanding. A key advantage of HANSOME is that the generated intentions not only enhance the lower-level policy but can also be shared with and understood by humans or other autonomous vehicles (AVs) to improve their planning. Furthermore, we devise AdaSMO, an entropy-controlled adaptive scalarization method, to tackle the multi-objective optimization problem in hierarchical policy learning. Extensive experiments show that HANSOME outperforms state-of-the-art WM-based methods in challenging driving tasks, enhancing overall traffic safety and efficiency.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Submission Number: 13103