RADI: LLMs as World Models for Robotic Action Decomposition and Imagination

Published: 06 Mar 2025, Last Modified: 15 Apr 2025
Venue: ICLR 2025 Workshop World Models
License: CC BY 4.0
Keywords: Robot task planning, Large language model, world model
Abstract: Robotics is indispensable to driving social progress, raising productivity, and improving human life, and efficient task planning is key to ensuring that robots perform complex tasks accurately. Traditional world models based on physical simulation or rule-based engines are limited by the high cost of environment modeling and by weak generalization to dynamic scenes. Large language models (LLMs), represented by GPT, have shown potential for general intelligence in natural language processing tasks and have made initial progress in robotic task planning, but their generalization ability as world models for the robotics domain has not been systematically verified. No study has yet answered whether LLMs can predict the outcomes of physical actions through task decomposition and environment imagination (rather than pure linguistic reasoning), or how their world-modeling capabilities should be assessed. In this paper, we propose the Robotic Action Decomposition and Imagination (RADI) framework, which draws on the self-reflective capability of LLMs and improves the success rate of task planning through two core mechanisms: action decomposition and environment imagination. Specifically, RADI first decomposes a complex robot task step by step into a sequence of atomic actions, then imagines the result of executing each action given the environment state, and uses the resulting state changes to verify whether the task expectations are met. If they are not, a self-reflection mechanism is triggered to re-optimize the action decomposition. Experiments conducted with GPT-4 in the VirtualHome environment show that RADI significantly improves the success rate of task planning and support the effectiveness of LLMs as world models in robotics.
Submission Number: 87
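
The decompose, imagine, verify, and reflect cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `radi_loop`, `toy_decompose`, `toy_imagine`, and `PlanResult` are hypothetical, the two toy functions stand in for what would be GPT-4 calls in the paper, the environment state is reduced to a flat dictionary rather than a VirtualHome scene, and verification is done once on the final imagined state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, str]


@dataclass
class PlanResult:
    actions: List[str]
    final_state: State
    attempts: int
    success: bool


def radi_loop(
    goal: State,
    initial_state: State,
    decompose: Callable[[Optional[str]], List[str]],
    imagine: Callable[[State, str], State],
    max_attempts: int = 3,
) -> PlanResult:
    """Hypothetical decompose -> imagine -> verify -> reflect loop."""
    feedback: Optional[str] = None
    actions: List[str] = []
    state: State = dict(initial_state)
    for attempt in range(1, max_attempts + 1):
        # 1. Action decomposition (an LLM call in the paper; a stub here).
        actions = decompose(feedback)
        # 2. Environment imagination: roll the state forward action by action.
        state = dict(initial_state)
        for action in actions:
            state = imagine(state, action)
        # 3. Verification: does the imagined state meet the task expectations?
        if all(state.get(key) == value for key, value in goal.items()):
            return PlanResult(actions, state, attempt, True)
        # 4. Self-reflection: pass the mismatch back to the decomposer.
        feedback = f"imagined state {state} does not satisfy goal {goal}"
    return PlanResult(actions, state, max_attempts, False)


def toy_decompose(feedback: Optional[str]) -> List[str]:
    """Stand-in for the LLM planner: the first plan forgets to open the fridge."""
    if feedback is None:
        return ["walk_to fridge", "grab milk"]
    return ["walk_to fridge", "open fridge", "grab milk"]


def toy_imagine(state: State, action: str) -> State:
    """Stand-in for the LLM world model: hand-written transition rules."""
    new_state = dict(state)
    verb, obj = action.split()
    if verb == "walk_to":
        new_state["robot_at"] = obj
    elif verb == "open" and new_state.get("robot_at") == obj:
        new_state[obj] = "open"
    elif verb == "grab" and new_state.get("fridge") == "open":
        new_state["holding"] = obj
    return new_state


result = radi_loop(
    goal={"holding": "milk"},
    initial_state={"robot_at": "kitchen", "fridge": "closed"},
    decompose=toy_decompose,
    imagine=toy_imagine,
)
print(result.success, result.attempts, result.actions)
```

In this toy run the first plan fails verification because the imagined fridge is still closed when the robot tries to grab the milk; the feedback string then leads the stub planner to insert the missing `open fridge` step. Replacing the two stubs with prompted LLM calls is the part the abstract attributes to RADI, and the details of those prompts are not given in this listing.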