LAP: Fast $\textbf{LA}$tent Diffusion $\textbf{P}$lanner with Fine-Grained Feature Distillation for Autonomous Driving

17 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0
Keywords: diffusion planning, autonomous driving
TL;DR: We propose a latent diffusion planning framework for autonomous driving.
Abstract: Diffusion models have demonstrated strong capabilities for modeling human-like driving behaviors in autonomous driving, but their iterative sampling process incurs substantial latency, and operating directly on raw trajectory points forces the model to spend capacity on low-level kinematics rather than high-level multi-modal semantics. To address these limitations, we propose $\textbf{LA}$tent $\textbf{P}$lanner (LAP), a framework that plans in a VAE-learned latent space that disentangles high-level intents from low-level kinematics, enabling the planner to capture rich, multi-modal driving strategies. We further introduce a fine-grained feature distillation mechanism that guides the interaction and fusion between the high-level semantic planning space and the vectorized scene context. Notably, LAP can produce high-quality plans in $\textbf{a single denoising step}$, substantially reducing computational overhead. In extensive evaluations on the large-scale nuPlan benchmark, LAP achieves $\textbf{state-of-the-art}$ closed-loop performance among learning-based planning methods while delivering an inference speed-up of up to $\mathbf{10\times}$ over previous SOTA approaches. Project website: https://anonymous.4open.science/w/Latent-Planner/.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 8441
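
The abstract describes planning in a latent space and decoding to a trajectory after as little as one denoising step. The following is a minimal sketch of that control flow only, not the authors' implementation: `ToyDenoiser`, `ToyDecoder`, `plan`, the dimensions, the linear noise schedule, and the clean-latent-prediction parameterization are all hypothetical stand-ins, and the "networks" are untrained random linear maps.

```python
import numpy as np

# All sizes below are assumptions chosen for illustration.
LATENT_DIM = 16    # size of the latent plan vector
CONTEXT_DIM = 32   # size of the encoded scene context
HORIZON = 80       # number of future (x, y) waypoints


class ToyDenoiser:
    """Hypothetical stand-in for a learned denoiser that predicts the clean latent."""

    def __init__(self, rng):
        self.w_z = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
        self.w_c = rng.normal(scale=0.1, size=(CONTEXT_DIM, LATENT_DIM))
        self.calls = 0  # counts network evaluations, the main latency driver

    def __call__(self, z_noisy, context):
        self.calls += 1
        return z_noisy @ self.w_z + context @ self.w_c


class ToyDecoder:
    """Hypothetical stand-in for a VAE decoder mapping a latent to waypoints."""

    def __init__(self, rng):
        self.w = rng.normal(scale=0.1, size=(LATENT_DIM, HORIZON * 2))

    def __call__(self, z):
        return (z @ self.w).reshape(-1, HORIZON, 2)


def plan(denoiser, decoder, context, rng, num_steps=1):
    """Sample a latent plan with `num_steps` denoiser calls, then decode it."""
    z = rng.normal(size=(context.shape[0], LATENT_DIM))  # start from pure noise
    for step in range(num_steps, 0, -1):
        z0_hat = denoiser(z, context)        # predict the clean latent
        noise_level = (step - 1) / num_steps  # toy schedule, 0 on the last step
        z = z0_hat + noise_level * rng.normal(size=z.shape)
    return decoder(z)


rng = np.random.default_rng(0)
denoiser = ToyDenoiser(rng)
decoder = ToyDecoder(rng)
context = rng.normal(size=(4, CONTEXT_DIM))  # batch of 4 fake scenes

traj = plan(denoiser, decoder, context, rng, num_steps=1)
print(traj.shape, denoiser.calls)
```

With `num_steps=1` the denoiser is evaluated once per plan, whereas a conventional multi-step sampler would evaluate it once per step; in this sketch that call count is the only source of the latency difference.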