SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout

Chiyu Max Jiang; Yijing Bai; Andre Cornman; Christopher Davis; Xiukun Huang; Hong Jeon; Sakshum Kulshrestha; John Wheatley Lambert; Shuangyu Li; Xuanyu Zhou; Carlos Fuertes; Chang Yuan; Mingxing Tan; Yin Zhou; Dragomir Anguelov

SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout

Chiyu Max Jiang, Yijing Bai, Andre Cornman, Christopher Davis, Xiukun Huang, Hong Jeon, Sakshum Kulshrestha, John Wheatley Lambert, Shuangyu Li, Xuanyu Zhou, Carlos Fuertes, Chang Yuan, Mingxing Tan, Yin Zhou, Dragomir Anguelov

Published: 25 Sept 2024, Last Modified: 06 Nov 2024NeurIPS 2024 posterEveryoneRevisionsBibTeXCC BY-NC-SA 4.0

Keywords: Autonomous Vehicles, Simulation, Diffusion Models, Scene Generation, Close Loop Simulation

Abstract: Simulation with realistic and interactive agents represents a key task for autonomous vehicle (AV) software development in order to test AV performance in prescribed, often long-tail scenarios. In this work, we propose SceneDiffuser, a scene-level diffusion prior for traffic simulation. We present a singular framework that unifies two key stages of simulation: scene initialization and scene rollout. Scene initialization refers to generating the initial layout for the traffic in a scene, and scene rollout refers to closed-loop simulation for the behaviors of the agents. While diffusion has been demonstrated to be effective in learning realistic, multimodal agent distributions, two open challenges remain: controllability and closed-loop inference efficiency and realism. To this end, to address controllability challenges, we propose generalized hard constraints, a generalized inference-time constraint mechanism that is simple yet effective. To improve closed-loop inference quality and efficiency, we propose amortized diffusion, a novel diffusion denoising paradigm that amortizes the physical cost of denoising over future simulation rollout steps, reducing the cost of per physical rollout step to a single denoising function evaluation, while dramatically reducing closed-loop errors. We demonstrate the effectiveness of our approach on the Waymo Open Dataset, where we are able to generate distributionally realistic scenes, while obtaining competitive performance in the Sim Agents Challenge, surpassing the state-of-the-art in many realism attributes.

Primary Area: Diffusion based models

Submission Number: 12732

Loading