RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning

Liam Boyle; Nicolas Baumann; Paviththiren Sivasothilingam; Michele Magno; Luca Benini

RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning

Liam Boyle, Nicolas Baumann, Paviththiren Sivasothilingam, Michele Magno, Luca Benini

Published: 08 Aug 2025, Last Modified: 16 Sept 2025CoRL 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Large Language Models, Reinforcement Learning, Embodied AI, Constrained Hardware

TL;DR: This paper extends R1-zero to embodied AI by training small LLMs via closed-loop RL for autonomous driving, enabling edge deployment with reasoning and adaptability once limited to significantly larger models.

Abstract: Future robotic systems operating in real-world environments require on-board embodied intelligence without continuous cloud connection, balancing capabilities with constraints on computational power and memory. This work presents an extension of the R1-zero approach, which enables the usage of small parameter-count Large Language Models (LLMs) in the robotic domain. The R1-Zero approach was originally developed to enable mathematical reasoning in LLMs using static datasets. We extend it to the robotics domain through integration with a closed-loop Reinforcement Learning (RL) framework. This extension allows reasoning in Embodied Artificial Intelligence (EmbodiedAI) settings without relying solely on distillation of large models through Supervised Fine-Tuning (SFT). We show that small-scale LLMs can achieve effective reasoning performance by learning through closed-loop interaction with their environment, which enables tasks that previously required significantly larger models. A performance gain of 20.2% points over the SFT-based baseline is observed with a Qwen2.5-1.5B model. Using the proposed training procedure, Qwen2.5-3B achieves a 63.3% control adaptability score, surpassing the 58.5% obtained by the much larger, cloud-bound GPT-4o. These results highlight that practical, on-board deployment of small LLMs is not only feasible but can outperform larger models when trained through environmental interaction, underscoring the importance of an interactive, embodied learning framework for robotic EmbodiedAI — one grounded in practical experience rather than static supervision.

Supplementary Material: zip

Spotlight: zip

Submission Number: 325

Loading