Keywords: Backdoor Attack, AI safety, Vision Large Language Models, Autonomous Driving
Abstract: Vision-Large-Language Models (VLMs) show great promise for autonomous driving.
Despite the ability of VLMs to comprehend and make decisions in complex scenarios, their integration into safety-critical autonomous driving systems poses serious safety risks.
In this paper, we propose \texttt{BadVLMDriver}, the first backdoor attack against VLMs for autonomous driving that can be launched in practice using \textit{physical} objects.
\texttt{BadVLMDriver} uses common physical items, such as a red balloon, to induce unsafe actions like sudden acceleration, highlighting a significant real-world threat to autonomous vehicle safety.
To execute \texttt{BadVLMDriver}, we develop an automated and efficient pipeline that uses natural language instructions to generate backdoor training samples with embedded malicious behaviors, without the need for retraining the model on a poisoned benign dataset.
We conduct extensive experiments to evaluate \texttt{BadVLMDriver} on two representative VLMs, five different trigger objects, and two types of malicious backdoor behaviors.
\texttt{BadVLMDriver} achieves a 92\% attack success rate in inducing sudden acceleration when the vehicle encounters a pedestrian holding a red balloon.
Submission Number: 9