RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning
Keywords: Large Language Models, Reinforcement Learning, Embodied AI, Constrained Hardware
TL;DR: This paper extends R1-Zero to embodied AI by training small LLMs via closed-loop RL for autonomous driving, enabling edge deployment with reasoning and adaptability once limited to significantly larger models.
Abstract: Future robotic systems operating in real-world environments require
on-board embodied intelligence without continuous cloud connection, balancing
capabilities with constraints on computational power and memory. This work
presents an extension of the R1-Zero approach that enables the use of small
parameter-count Large Language Models (LLMs) in the robotic domain. The
R1-Zero approach was originally developed to enable mathematical reasoning in
LLMs using static datasets. We extend it to the robotics domain through integration with a closed-loop Reinforcement Learning (RL) framework. This extension
allows reasoning in Embodied Artificial Intelligence (EmbodiedAI) settings without relying solely on distillation of large models through Supervised Fine-Tuning
(SFT). We show that small-scale LLMs can achieve effective reasoning performance by learning through closed-loop interaction with their environment, which
enables tasks that previously required significantly larger models. A performance
gain of 20.2 percentage points over the SFT-based baseline is observed with a Qwen2.5-1.5B
model. Using the proposed training procedure, Qwen2.5-3B achieves a 63.3%
control adaptability score, surpassing the 58.5% obtained by the much larger,
cloud-bound GPT-4o. These results highlight that practical, on-board deployment
of small LLMs is not only feasible but can outperform larger models when trained
through environmental interaction, underscoring the importance of an interactive,
embodied learning framework for robotic EmbodiedAI, one grounded in practical experience rather than static supervision.
Supplementary Material: zip
Spotlight: zip
Submission Number: 325