RoboChemist: Long-Horizon and Safety-Compliant Robotic Chemical Experimentation

Published: 08 Aug 2025, Last Modified: 16 Sept 2025, CoRL 2025 Poster, CC BY 4.0
Keywords: Robotic Chemistry, VLA Models, Visual Prompting
TL;DR: RoboChemist combines VLMs and VLA models in a closed-loop framework to safely and reliably automate complex chemical experiments.
Abstract: Robotic chemists promise to both liberate human experts from repetitive tasks and accelerate scientific discovery, yet remain in their infancy. Chemical experiments involve long-horizon procedures over hazardous and deformable substances, where success requires not only task completion but also strict compliance with experimental norms. To address these challenges, we propose RoboChemist, a dual-loop framework that integrates Vision-Language Models (VLMs) with Vision-Language-Action (VLA) models. Unlike prior VLM-based systems (e.g., VoxPoser, ReKep) that rely on depth perception and struggle with transparent labware, and existing VLA systems (e.g., RDT, $\pi_0$) that lack semantic-level feedback for complex tasks, our method leverages a VLM to serve as (1) a planner to decompose tasks into primitive actions, (2) a visual prompt generator to guide VLA models, and (3) a monitor to assess task success and regulatory compliance. Notably, we introduce a VLA interface that accepts image-based visual targets from the VLM, enabling precise, goal-conditioned control. Our system successfully executes both primitive actions and complete multi-step chemistry protocols. Results show significant improvements in both success rate and compliance rate over state-of-the-art VLM and VLA baselines, while also demonstrating strong generalization to objects and tasks. Code, data, and models will be released.
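Illustrative sketch (not from the paper): the three VLM roles named in the abstract (planner, visual prompt generator, monitor) can be pictured as a plan, act, check loop around a VLA policy. The abstract does not say how monitor feedback is used, so the retry-on-failure policy below is an assumption. Every name here (`run_protocol`, `Verdict`, `make_visual_prompt`, `act`, `monitor`, `max_retries`) is hypothetical and stands in for a model call; none of them is the authors' API.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical monitor output: task success and norm compliance."""
    success: bool
    compliant: bool


def run_protocol(
    task: str,
    plan: Callable[[str], List[str]],          # stand-in for the VLM planner
    make_visual_prompt: Callable[[str], Any],  # stand-in for the VLM prompt generator
    act: Callable[[str, Any], None],           # stand-in for the VLA policy
    monitor: Callable[[str], Verdict],         # stand-in for the VLM monitor
    max_retries: int = 2,                      # assumed retry budget per primitive
) -> Tuple[bool, List[Tuple[str, int, Verdict]]]:
    """Run each primitive action, retrying until the monitor accepts it."""
    log: List[Tuple[str, int, Verdict]] = []
    for primitive in plan(task):                   # (1) decompose the task
        for attempt in range(max_retries + 1):
            prompt = make_visual_prompt(primitive)  # (2) image-based target
            act(primitive, prompt)                  # goal-conditioned control
            verdict = monitor(primitive)            # (3) success + compliance
            log.append((primitive, attempt, verdict))
            if verdict.success and verdict.compliant:
                break  # accepted, move on to the next primitive
        else:
            # Retry budget exhausted without an accepted attempt.
            return False, log
    return True, log
```

In this sketch a primitive counts as done only when the monitor reports both success and compliance, mirroring the abstract's point that task completion alone is not enough.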
Supplementary Material: zip
Spotlight: zip
Submission Number: 38