VLA-Risk: Benchmarking Vision-Language-Action Models with Physical Robustness

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: vision-language-action models, typographic attack, robustness evaluation
Abstract: Vision-Language-Action (VLA) models have recently demonstrated impressive capabilities in unifying visual perception, natural language understanding, and physical action execution. Despite these advances, they introduce new attack surfaces and vulnerabilities in both instruction execution and visual understanding. While several studies have begun to investigate such weaknesses via adversarial attacks, the field still lacks a unified benchmark for systematically evaluating risk across different modalities. To address this gap, we present VLA-Risk, a benchmark for assessing the risks of VLA models across input modalities (e.g., image and instruction) and along three fundamental task dimensions: object, action, and space. VLA-Risk spans 296 scenarios and 3,784 episodes, covering diverse settings such as simple manipulation, semantic reasoning, and autonomous driving. By structuring attacks around these dimensions, VLA-Risk provides a principled framework for analyzing vulnerabilities and guiding the development of safer and more robust embodied agents. Extensive empirical evaluation further shows that current state-of-the-art VLA models face substantial challenges under our attack tasks.
Primary Area: datasets and benchmarks
Submission Number: 23105
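The abstract above organizes attacks by input modality (image, instruction) and by three task dimensions (object, action, space), across settings such as simple manipulation, semantic reasoning, and autonomous driving. The minimal Python sketch below only illustrates that taxonomy as a data structure; all class and field names are hypothetical and do not reflect VLA-Risk's actual release format.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical schema: encodes only the taxonomy stated in the abstract,
# not the benchmark's real data format.

class Modality(Enum):
    IMAGE = "image"              # attack applied to the visual observation
    INSTRUCTION = "instruction"  # attack applied to the language command

class TaskDimension(Enum):
    OBJECT = "object"
    ACTION = "action"
    SPACE = "space"

class Setting(Enum):
    SIMPLE_MANIPULATION = "simple_manipulation"
    SEMANTIC_REASONING = "semantic_reasoning"
    AUTONOMOUS_DRIVING = "autonomous_driving"

@dataclass
class Scenario:
    """One attack scenario; the benchmark groups 296 such scenarios
    into 3,784 evaluation episodes overall."""
    setting: Setting
    attacked_modality: Modality
    dimension: TaskDimension
    num_episodes: int

# Example: a scenario attacking the instruction channel along the
# "object" dimension in a manipulation setting (illustrative values).
example = Scenario(
    setting=Setting.SIMPLE_MANIPULATION,
    attacked_modality=Modality.INSTRUCTION,
    dimension=TaskDimension.OBJECT,
    num_episodes=12,
)
```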