RoboGPT : An intelligent agent of making embodied long-term decisions for daily instruction tasks

Yaran Chen; Wenbo Cui; Yuanwen Chen; Mining Tan; Xinyao Zhang; Dongbin Zhao; He Wang

23 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Keywords: Robot task planning, Daily tasks following by instructions, Embodied AI

TL;DR: The RoboGPT agent solves daily instruction tasks that require long-term decisions by combining LLM planning with low-level policies.

Abstract: Robotic agents must master common sense and long-term sequential decisions to solve daily tasks through natural language instruction. Developments in Large Language Models (LLMs) for natural language processing have inspired efforts to use LLMs in complex robot planning. Despite LLMs' great generalization and comprehension of instructional tasks, LLM-generated task plans sometimes lack feasibility and correctness. To address the problem, we propose a RoboGPT agent for making embodied long-term decisions for daily tasks, with two modules: 1) LLM-based planning with Re-Plan to break the task into multiple sub-goals; 2) RoboSkill, individually designed for sub-goals to learn better navigation and manipulation skills. The LLM-based planning is enhanced with a new robotic dataset and re-planning, and is called RoboGPT. The new robotic dataset of 67k daily instruction tasks is gathered for fine-tuning the LLaMA model and obtaining RoboGPT. The RoboGPT planner, with strong generalization, can plan hundreds of daily tasks and re-plan based on the environment, thereby addressing the nomenclature diversity challenge. Additionally, a low-computational Re-Plan module is designed to allow plans to flexibly adapt to the environment. The proposed RoboGPT agent outperforms SOTA methods on the ALFRED daily tasks. Moreover, the RoboGPT planner exceeds SOTA LLM-based planners like ChatGPT in task-planning rationality for hundreds of unseen daily tasks, and even tasks from other domains, while keeping the large model's original broad application and generality.
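
The plan-then-re-plan idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: `SubGoal`, `toy_planner`, `re_plan`, and `run` are hypothetical names, the planner is a fixed stub standing in for the fine-tuned LLM, and plain string similarity stands in for whatever matching the Re-Plan module actually uses. The sketch only shows how a sub-goal whose object name does not appear in the environment might be remapped to an object that does.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class SubGoal:
    action: str  # hypothetical skill name, e.g. "PickupObject"
    target: str  # object name as written by the planner


def toy_planner(instruction: str) -> list[SubGoal]:
    """Stub standing in for the LLM planner: returns one fixed plan."""
    return [
        SubGoal("Navigate", "cup"),
        SubGoal("PickupObject", "cup"),
        SubGoal("Navigate", "sink"),
        SubGoal("PutObject", "sink"),
    ]


def re_plan(goal: SubGoal, visible: set[str]) -> SubGoal:
    """Remap the target to the most similar visible object name.

    String similarity is an assumption used for this toy example only.
    """
    if goal.target in visible or not visible:
        return goal

    def score(name: str) -> float:
        return SequenceMatcher(None, goal.target.lower(), name.lower()).ratio()

    # sorted() makes tie-breaking deterministic
    best = max(sorted(visible), key=score)
    return SubGoal(goal.action, best)


def run(instruction: str, visible: set[str]) -> list[SubGoal]:
    """Plan once, then adapt each sub-goal to the objects in the environment."""
    return [re_plan(goal, visible) for goal in toy_planner(instruction)]


if __name__ == "__main__":
    for goal in run("wash the cup", {"Mug", "SinkBasin", "Fridge"}):
        print(goal.action, goal.target)
```

In this sketch the adaptation step is a cheap lookup over detected object names rather than a second call to the planner, which is one way to read the abstract's "low-computational" description; the actual module may work differently.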

Supplementary Material: zip

Primary Area: applications to robotics, autonomy, planning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 6784