SelfEvoWM: Self-Evolving Task Discovery and In-Imagination Robot Learning with DROID-Grounded World Models

Published: 03 Mar 2026, Last Modified: 09 Mar 2026 · ICLR 2026 Workshop MemAgents · CC BY 4.0
Keywords: Self-evolving task discovery, Controllable generative world models, Simulation chain integration, Targeted simulation environment construction
TL;DR: We present SelfEvoWM, a DROID-grounded generate–verify–repair loop that uses a controllable world model and VLM-based failure localization to build targeted simulation episodes and improve physical consistency for self-evolving robot task learning.
Abstract: Controllable generative world models make it possible to iterate on manipulation behaviors without repeatedly running real robots. In practice, the bottleneck is often not only "how to generate" but how to keep an automated loop tethered to the same assumptions that make classical simulation pipelines reliable: realistic initial states, consistent task semantics, and careful data hygiene. We present SelfEvoWM, a generate–verify–repair loop built on Ctrl-World that explicitly interfaces with a simulation chain. The loop (i) grounds goal proposals by retrieving DROID anchors as simulation-ready initial states, (ii) uses a VLM critic to audit physical consistency and localize failure modes, and (iii) automatically constructs targeted simulation environments to generate supplemental data that repairs weak regions of the world model. Rather than claiming a finished benchmark, this workshop paper focuses on the system design and the early failure modes we observed when wiring a world model into an automated simulation pipeline, including retrieval collapse, contact-level artifacts, and the sensitivity of VLM judgments to phrasing and viewpoint. We hope this provides a concrete starting point for integrating generative world models with end-to-end simulation stacks for robot learning.
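The three-stage loop summarized in the abstract can be sketched in a few lines of Python. Everything below is a hypothetical illustration: the function names (`retrieve_anchor`, `vlm_audit`, `repair`), the keyword-overlap retrieval, and the toy critic rule are assumptions made for exposition, not the authors' actual SelfEvoWM or Ctrl-World API.

```python
# Hypothetical sketch of the generate–verify–repair loop from the abstract.
# All names and heuristics here are illustrative stand-ins, not the paper's API.
from dataclasses import dataclass, field

@dataclass
class Episode:
    goal: str
    anchor: str                 # retrieved DROID episode serving as initial state
    consistent: bool = True
    failures: list = field(default_factory=list)

def retrieve_anchor(goal, droid_index):
    # (i) Ground the goal proposal by retrieving a DROID anchor as a
    # simulation-ready initial state (toy keyword-overlap retrieval here).
    return max(droid_index, key=lambda a: sum(w in a for w in goal.split()))

def vlm_audit(episode):
    # (ii) Stand-in for the VLM critic that audits physical consistency and
    # localizes failure modes (illustrative rule: flag grasping goals).
    if "grasp" in episode.goal:
        episode.consistent = False
        episode.failures.append("contact-level artifact near gripper")
    return episode

def repair(episode, sim_data):
    # (iii) Construct a targeted simulation environment per failure mode and
    # log supplemental data used to repair the world model's weak regions.
    for failure in episode.failures:
        sim_data.append((episode.goal, failure))
    episode.consistent, episode.failures = True, []
    return episode

def self_evo_loop(goals, droid_index):
    sim_data, episodes = [], []
    for goal in goals:
        ep = Episode(goal=goal, anchor=retrieve_anchor(goal, droid_index))
        ep = vlm_audit(ep)
        if not ep.consistent:
            ep = repair(ep, sim_data)
        episodes.append(ep)
    return episodes, sim_data
```

In this sketch the loop terminates after one repair pass per episode; the paper's system presumably iterates until the critic passes or a budget is exhausted, a detail the abstract does not specify.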
Submission Number: 102