LaViPlan : Language-Guided Visual Path Planning with RLVR

Published: 18 Oct 2025 · Last Modified: 29 Jan 2026 · Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops · CC BY 4.0
Abstract: Out-of-distribution (OOD) scenarios in autonomous driving pose critical challenges, as planners often fail to generalize beyond their training experience, leading to unsafe or unexpected behavior. Vision-Language Models (VLMs) have shown promise in handling such scenarios by providing high-level scene understanding and user-aligned decisions. However, existing VLMs often exhibit a misalignment between their language-based reasoning and the low-level trajectories required for action-level planning. In this paper, we propose LaViPlan, a framework that leverages Reinforcement Learning with Verifiable Rewards (RLVR) to fine-tune VLMs using planning-oriented metrics. Experimental results show that LaViPlan improves planning performance across both in-domain and out-of-domain datasets. While linguistic fidelity slightly decreases after RLVR-based fine-tuning, qualitative evaluation indicates that the outputs remain coherent. We also conduct ablation studies to analyze the effects of sampling ratio and reasoning guidance, highlighting how these design choices influence performance. These findings demonstrate the potential of RLVR as a post-training paradigm for aligning language-guided reasoning with action-level planning in autonomous driving.
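The abstract's core idea of using a planning-oriented metric as a verifiable reward can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: it scores a predicted waypoint trajectory against ground truth by average L2 displacement error (ADE, a standard planning metric) and maps that error to a bounded reward. The function names, the exponential shaping, and the `scale` parameter are all hypothetical choices for illustration.

```python
import math

def ade(pred, gt):
    """Average L2 displacement error between two equal-length (x, y) waypoint lists."""
    assert len(pred) == len(gt) and pred, "trajectories must be non-empty and aligned"
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

def planning_reward(pred, gt, scale=1.0):
    """Verifiable reward in (0, 1]: 1.0 for a perfect trajectory, decaying with ADE.

    The exponential shaping and `scale` are illustrative assumptions,
    not taken from the LaViPlan paper.
    """
    return math.exp(-ade(pred, gt) / scale)

# A perfect prediction earns the maximum reward.
gt = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
print(planning_reward(gt, gt))  # → 1.0
```

Because the reward is computed deterministically from the predicted trajectory and ground truth, it is directly checkable, which is what makes it usable as a "verifiable" signal in RLVR-style post-training.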