Keywords: end-to-end autonomous driving, closed-loop evaluation
Abstract: We introduce ReGen4AD, an interactive and controllable retrieval-based online video generation pipeline for closed-loop reactive driving evaluation. Unlike existing video generative models for autonomous driving (AD), which generate multiple frames at once, our design is tailored for interactive simulation: sensor rendering and behavior rollout are decoupled by applying a separate behavioral controller to simulate the reactions of surrounding agents. As a result, the generative model can focus on image fidelity, control adherence, and spatial-temporal coherence. For temporal consistency, given the stepwise nature of simulation, we design a noise-modulating temporal encoder with Gaussian blurring that supports long-horizon autoregressive rollout of image sequences without degradation from distribution shift. For spatial consistency, we introduce a retrieval mechanism that takes the spatially nearest images as references to ensure scene-level rendering fidelity. The spatial relations between target and reference are explicitly modeled with 3D relative position encodings. Potential over-reliance on reference images is mitigated with hierarchical sampling and classifier-free guidance. We compare generation quality against existing AD generative models and show that ReGen4AD is superior in the online driving setting. We further integrate it into nuPlan and evaluate generation quality through closed-loop simulation results.
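As an illustration of the retrieval step described in the abstract, the following is a minimal sketch of selecting the spatially nearest reference images for a target camera position. It is not the authors' implementation: the function name `retrieve_nearest_references`, the `(image_id, position)` database layout, the use of Euclidean distance, and the world-frame offsets are all assumptions made for this example. The abstract does not specify how the 3D relative position encodings are computed, so the offsets returned here only stand in for the raw input to such an encoding.

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[float, float, float]


def retrieve_nearest_references(
    target: Position,
    database: Sequence[Tuple[str, Position]],
    k: int = 2,
) -> List[Tuple[str, Position]]:
    """Return the k entries closest to `target` with their offsets from it.

    Hypothetical helper: `database` holds (image_id, position) pairs, and
    closeness is plain Euclidean distance in world coordinates (an assumption).
    """
    # Rank every stored image by distance from its position to the target.
    ranked = sorted(database, key=lambda entry: math.dist(entry[1], target))
    results = []
    for image_id, position in ranked[:k]:
        # Offset of the reference relative to the target, in world frame.
        offset = tuple(p - t for p, t in zip(position, target))
        results.append((image_id, offset))
    return results
```

A call such as `retrieve_nearest_references((0.5, 0.0, 0.0), database, k=2)` returns the two closest image identifiers, nearest first, each paired with its offset from the target position.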
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 5857