LaViPlan : Language-Guided Visual Path Planning with RLVR

Published: 18 Oct 2025 · Last Modified: 29 Jan 2026 · Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops · CC BY 4.0
Abstract: Out-of-distribution (OOD) scenarios in autonomous driving pose critical challenges, as planners often fail to generalize beyond their training experience, leading to unsafe or unexpected behavior. Vision-Language Models (VLMs) have shown promise in handling such scenarios by providing high-level scene understanding and user-aligned decisions. However, existing VLMs often exhibit a misalignment between their language-based reasoning and the low-level trajectories required for action-level planning. In this paper, we propose LaViPlan, a framework that leverages Reinforcement Learning with Verifiable Rewards (RLVR) to fine-tune VLMs using planning-oriented metrics. Experimental results show that LaViPlan improves planning performance across both in-domain and out-of-domain datasets. While linguistic fidelity slightly decreases after RLVR-based fine-tuning, qualitative evaluation indicates that the outputs remain coherent. We also conduct ablation studies to analyze the effects of sampling ratio and reasoning guidance, highlighting how these design choices influence performance. These findings demonstrate the potential of RLVR as a post-training paradigm for aligning language-guided reasoning with action-level planning in autonomous driving.
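The abstract's core idea of using a planning-oriented metric as a verifiable reward can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: it scores a predicted waypoint trajectory against ground truth by average L2 displacement error (ADE, a standard planning metric) and maps that error to a bounded reward. The function names, the exponential shaping, and the `scale` parameter are all hypothetical choices for illustration.

```python
import math

def ade(pred, gt):
    """Average L2 displacement error between two equal-length (x, y) waypoint lists."""
    assert len(pred) == len(gt) and pred, "trajectories must be non-empty and aligned"
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

def planning_reward(pred, gt, scale=1.0):
    """Verifiable reward in (0, 1]: 1.0 for a perfect trajectory, decaying with ADE.

    The exponential shaping and `scale` are illustrative assumptions,
    not taken from the LaViPlan paper.
    """
    return math.exp(-ade(pred, gt) / scale)

# A perfect prediction earns the maximum reward.
gt = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
print(planning_reward(gt, gt))  # → 1.0
```

Because the reward is computed deterministically from the predicted trajectory and ground truth, it is directly checkable, which is what makes it usable as a "verifiable" signal in RLVR-style post-training.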