ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving

Published: 08 Apr 2026, Last Modified: 08 Apr 2026
CVPR 2026 Workshop WDFM-EAI Poster
License: CC BY 4.0
Keywords: Language-Conditioned Autonomous Driving, Vision-Language-Action Models, Counterfactual Testing, Safety-Critical AI
Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simulation, yet standard evaluations largely assume instructions are precise and well-formed. In deployment, instructions vary in phrasing and specificity, may omit critical qualifiers, and can occasionally include misleading, authority-framed text, leaving instruction-level robustness under-measured. We introduce \textbf{ICR-Drive}, a diagnostic framework for \textbf{instruction counterfactual robustness} in end-to-end language-conditioned autonomous driving. ICR-Drive generates controlled instruction variants spanning four perturbation families, \textit{Paraphrase}, \textit{Ambiguity}, \textit{Noise}, and \textit{Misleading}; Misleading variants conflict with the navigation goal and attempt to override the agent's intent. We replay identical CARLA routes under matched simulator configurations and seeds to isolate performance changes attributable to instruction language. Robustness is quantified using standard CARLA Leaderboard metrics and per-family performance degradation relative to the baseline instruction. Experiments on LMDrive and BEVDriver show that minor instruction changes can induce substantial performance drops and distinct failure modes, revealing a reliability gap for deploying embodied foundation models in safety-critical driving.
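A minimal sketch of the per-family degradation metric the abstract describes: each perturbed run is compared against the matched baseline run on the same route, and drops are averaged per perturbation family. The record format, field names, and scores below are hypothetical illustrations, not the paper's actual evaluation harness.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: one closed-loop run per (route, instruction variant).
# "driving_score" follows the CARLA Leaderboard convention (higher is better);
# "family" is one of the four perturbation families, or "baseline".
runs = [
    {"route": "Town05_route_03", "family": "baseline",   "driving_score": 78.4},
    {"route": "Town05_route_03", "family": "Paraphrase", "driving_score": 74.1},
    {"route": "Town05_route_03", "family": "Misleading", "driving_score": 41.9},
    # ... one entry per replayed route/variant pair
]

def per_family_degradation(runs):
    """Average drop in driving score per perturbation family,
    relative to the matched baseline run on the same route."""
    baseline = {r["route"]: r["driving_score"]
                for r in runs if r["family"] == "baseline"}
    drops = defaultdict(list)
    for r in runs:
        if r["family"] == "baseline":
            continue
        # Matched-route comparison isolates the effect of instruction wording.
        drops[r["family"]].append(baseline[r["route"]] - r["driving_score"])
    return {family: mean(values) for family, values in drops.items()}

print(per_family_degradation(runs))
# e.g. {'Paraphrase': 4.3, 'Misleading': 36.5}
```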
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 5