Keywords: Embodied Graph, Relational World Model
Abstract: World models empower embodied exploration by building internal representations and predicting future states of the environment. Nevertheless, existing approaches rely solely on dense supervision such as pixel-level signals and fail to distinguish task-relevant from task-irrelevant information during exploration, even though task-relevant information is normally sparse. As a result, this dense-supervision design suffers from suboptimal exploration caused by noisy task-irrelevant information. To address this issue, we propose Embodied Graph, which is, to the best of our knowledge, the first sparse-supervision design capable of capturing the sparse task-relevant information for embodied exploration. However, Embodied Graph remains unexplored and poses three challenges: 1) How to use an embodied graph to model the environment dynamics? 2) How to update the embodied graph in a dynamic environment? 3) How to define and learn graph-grounded actions and policies during exploration? To solve these challenges, we propose to instantiate Embodied Graph as a Relational World Model (RWM) for embodied task execution. Specifically, we first design and formalize the embodied graph, incorporating definitions of nodes and edges and extending the concept of interactions. Based on this formulation, the instantiated RWM serves as a hierarchical architecture consisting of a high-level RWM and a low-level RWM. On the one hand, the high-level RWM includes: (i) an embodied dynamic graph constructor that continually updates nodes and edges based on reachability and frontier discovery; (ii) a graph-guided macro-action generator that nominates exploratory macro-action candidates by jointly balancing exploration gain, operational cost, and potential risks. On the other hand, the low-level RWM integrates a plug-and-play behavioral model that executes the selected macro-actions.
Extensive experiments on Minecraft and Atari demonstrate that our proposed RWM significantly outperforms state-of-the-art baseline methods. In particular, on Minecraft, our proposed RWM is the only one among all compared approaches capable of achieving the final goal within the given budget.
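As a rough illustration of the two high-level components described in the abstract, the following minimal Python sketch maintains a toy graph with frontier discovery and ranks frontier nodes as macro-action candidates. It is not the authors' implementation: the class `EmbodiedGraph`, the functions `score` and `best_macro_action`, the linear scoring rule, and the weights `w_cost` and `w_risk` are all assumptions made for illustration, as are the example place names and the numeric estimates.

```python
from dataclasses import dataclass, field


@dataclass
class EmbodiedGraph:
    """Toy embodied graph (hypothetical): nodes are places, edges mean 'reachable'."""
    edges: dict = field(default_factory=dict)   # node -> set of neighboring nodes
    visited: set = field(default_factory=set)   # nodes the agent has already explored

    def add_edge(self, a, b):
        # Record an undirected reachability edge between two nodes.
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def visit(self, node):
        # Mark a node as explored, creating it if it is new.
        self.edges.setdefault(node, set())
        self.visited.add(node)

    def frontier(self):
        # Frontier = known nodes adjacent to a visited node but not yet visited.
        return {n for v in self.visited for n in self.edges[v]} - self.visited


def score(gain, cost, risk, w_cost=0.5, w_risk=1.0):
    # Assumed linear trade-off; the abstract does not specify the actual form.
    return gain - w_cost * cost - w_risk * risk


def best_macro_action(graph, estimates):
    # Nominate the frontier node with the highest score, or None if there is none.
    candidates = sorted(graph.frontier())
    return max(candidates, key=lambda n: score(*estimates[n]), default=None)


# Example with made-up places and (gain, cost, risk) estimates.
g = EmbodiedGraph()
g.visit("base")
g.add_edge("base", "cave")
g.add_edge("base", "forest")
g.add_edge("cave", "lava")      # "lava" is known but not yet on the frontier

estimates = {"cave": (5.0, 2.0, 3.0), "forest": (3.0, 1.0, 0.5)}
choice = best_macro_action(g, estimates)
```

In this sketch the selected node would be handed to a separate low-level controller for execution, mirroring the high-level/low-level split described in the abstract; how the estimates are learned is outside the scope of the example.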
Primary Area: applications to robotics, autonomy, planning
Submission Number: 19547