Keywords: Embodied Intelligence, Minecraft, Large Language Model, Large Auto-regressive Model
TL;DR: We introduce the concept of the large auto-regressive model and study how to use it to handle embodied AI tasks.
Abstract: Because they must interact with the world, embodied agents are required to possess comprehensive task-relevant knowledge, long-horizon planning capability, and swift response speed. Large language models (LLMs), owing to their rich general knowledge, have recently achieved promising results in open-world embodied tasks, such as world exploration in Minecraft. However, the outputs of LLMs are descriptive sentences or code, which are slow to generate and not end-to-end, as a translator is required to convert the LLM outputs into executable actions. To address these limitations, we introduce the large auto-regressive model (LARM). LARM takes environment observations as input and predicts subsequent actions in an auto-regressive manner. Compared with LLM-based methods, LARM directly predicts the next skill to execute according to the current observation. In addition, considering that commonly adopted training paradigms do not reflect the mutual influence and dependency between actions and observations, we develop a novel data format named the auto-regressive node transmission structure and assemble a corresponding dataset to train LARM. Combining these techniques, LARM successfully harvests enchanted equipment in Minecraft, which demands significantly longer and more complex decision-making chains than the highest achievements of prior best methods. Moreover, LARM is 6.8x faster than LLMs of similar parameter volume.
Primary Area: applications to robotics, autonomy, planning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Submission Number: 6112