Adaptive Safety Probing for Resource-Efficient Vision-Language-Action Models

Published: 01 Jun 2026, Last Modified: 10 Jun 2026
Venue: AdaptFM Poster
License: CC BY 4.0
Keywords: foundation models, adaptive runtime, VLA, robotics, safety, security
TL;DR: VLA models already encode where to act and what to attend to. By probing their internal representations and triggering safeguards only when needed, we enable real-time, low-cost safety monitoring for robotic foundation models.
Abstract: Foundation models (FMs) are rapidly moving from cloud-centered language and vision applications into embodied robotic systems, where they must support real-time decision making under strict resource constraints. Vision-Language-Action (VLA) models are a prominent class of such embodied FMs and are increasingly used as the main controller of robotic systems. While this integration enables general-purpose robotic behavior, it also introduces safety and security risks: a compromised instruction or an unsafe generated action can propagate directly to the robot's actuators. Existing defenses often rely on auxiliary perception models, repeated inference, or always-on safety filters, all of which are poorly matched to resource-constrained robotic platforms. We propose an adaptive inference-time safety framework for VLA models that reuses signals already computed inside the frozen policy. First, we show that intermediate hidden states encode task-relevant 3D geometric intent: a lightweight probe trained offline on successful rollouts predicts the intended grasp target and can serve as a low-cost failure monitor. Second, we reconstruct targeted action-to-vision attention maps from the VLA backbone to identify the image regions driving the predicted action, avoiding the need for a separate heavyweight vision-language model. Third, we introduce Post-processing Trigger-based Purification, a check-then-defend mechanism that invokes more expensive safety interventions only when the lightweight monitors detect a policy violation, target mismatch, or uncertain execution. Together, these components provide a resource-aware adaptive safety and security layer for emerging robotic foundation-model systems. In simulation, the framework achieves up to a 40% runtime improvement over a non-adaptive safe FM.
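The check-then-defend idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `MonitorResult`, `check`, and `guarded_step`, the thresholds `tol` and `min_conf`, and the use of a scalar confidence score are all hypothetical, chosen only to show how a cheap monitor might gate an expensive defense.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Vector = Sequence[float]


@dataclass
class MonitorResult:
    triggered: bool
    reason: str


def target_mismatch(predicted: Vector, expected: Vector, tol: float) -> bool:
    """Return True if the probed target is farther than tol from the expected one."""
    dist = sum((p - e) ** 2 for p, e in zip(predicted, expected)) ** 0.5
    return dist > tol


def check(predicted: Vector, expected: Vector, confidence: float,
          tol: float = 0.05, min_conf: float = 0.6) -> MonitorResult:
    """Cheap monitor: flags a target mismatch or low confidence (assumed signals)."""
    if target_mismatch(predicted, expected, tol):
        return MonitorResult(True, "target_mismatch")
    if confidence < min_conf:
        return MonitorResult(True, "uncertain_execution")
    return MonitorResult(False, "ok")


def guarded_step(action: Vector, result: MonitorResult,
                 defend: Callable[[Vector], Vector]) -> Vector:
    """Run the expensive defense only when the monitor has triggered."""
    return defend(action) if result.triggered else action
```

In this sketch the costly `defend` callable, a stand-in for whatever purification step is used, is skipped entirely on steps where the monitor reports `ok`. That skipping is where any runtime saving of an adaptive scheme would come from.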
Submission Number: 81