Keywords: Language-Conditioned Autonomous Driving, Vision-Language-Action Models, Counterfactual Testing, Safety-Critical AI
Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving
agents to execute natural-language navigation commands in closed-loop simulation, yet standard
evaluations largely assume instructions are precise and well-formed. In deployment, instructions
vary in phrasing and specificity, may omit critical qualifiers, and can occasionally include
misleading, authority-framed text, leaving instruction-level robustness under-measured. We
introduce \textbf{ICR-Drive}, a diagnostic framework for \textbf{instruction counterfactual
robustness} in end-to-end language-conditioned autonomous driving. ICR-Drive generates controlled
instruction variants spanning four perturbation families: \textit{Paraphrase},
\textit{Ambiguity}, \textit{Noise}, and \textit{Misleading}; Misleading variants conflict with the
navigation goal and attempt to override the agent's intent. We replay identical CARLA routes under matched
simulator configurations and seeds to isolate performance changes attributable to instruction
language. Robustness is quantified using standard CARLA Leaderboard metrics and per-family
performance degradation relative to the baseline instruction. Experiments on LMDrive and BEVDriver
show that minor instruction changes can induce substantial performance drops and distinct failure
modes, revealing a reliability gap when deploying embodied foundation models in safety-critical
driving.
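The abstract quantifies robustness as per-family performance degradation relative to the baseline instruction. Below is a minimal sketch of one plausible reading of that metric as a relative score drop; the score values, family names as dictionary keys, and the exact formula are illustrative assumptions, not the paper's definitions (which rest on the standard CARLA Leaderboard metrics).

```python
# Illustrative sketch of a per-family degradation metric. All numbers
# are hypothetical; the paper's actual metrics (CARLA Leaderboard
# driving score, route completion, etc.) may be defined differently.

from statistics import mean

# Hypothetical driving scores from replaying the same routes under
# matched simulator configurations and seeds, one list per
# instruction condition.
scores = {
    "baseline":   [78.2, 81.5, 74.9],
    "paraphrase": [75.0, 79.1, 72.3],
    "ambiguity":  [61.4, 66.8, 58.0],
    "noise":      [70.2, 73.5, 69.9],
    "misleading": [42.7, 50.3, 39.8],
}

def degradation(baseline: list[float], variant: list[float]) -> float:
    """Relative performance drop of a perturbation family versus the
    baseline instruction: (S_base - S_family) / S_base."""
    s_base, s_var = mean(baseline), mean(variant)
    return (s_base - s_var) / s_base

for family in ("paraphrase", "ambiguity", "noise", "misleading"):
    drop = degradation(scores["baseline"], scores[family])
    print(f"{family:>10}: {drop:.1%} relative drop")
```

Under this assumed formulation, a larger relative drop for the Misleading family than for Paraphrase would match the abstract's claim that goal-conflicting instructions induce the most substantial failures.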