UAOR: Uncertainty-aware Observation Reinjection for Vision-Language-Action Models

03 Sept 2025 (modified: 12 Feb 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Vision-Language-Action, Uncertainty-Aware, Observation Reinjection
TL;DR: An efficient, training-free module that augments Vision-Language-Action models via Uncertainty-aware Observation Reinjection
Abstract: Vision–Language–Action (VLA) models leverage pretrained Vision–Language Models (VLMs) as backbones to map images and instructions to actions, demonstrating remarkable potential for generalizable robotic manipulation. To improve performance, many methods incorporate additional observation cues (e.g., depth maps, point clouds) and auxiliary modules (e.g., object detectors, encoders), enabling more precise and reliable task execution. Although effective, these approaches often require extensive data collection and additional training or fine-tuning, limiting their flexibility and scalability. Inspired by the finding that the Feed-Forward Network (FFN) in language models can act as a "key-value memory", we propose **U**ncertainty-**a**ware **O**bservation **R**einjection (**UAOR**), an effective training-free and plug-and-play module for VLA models. Specifically, when the current language-model layer exhibits high uncertainty, measured by **Action Entropy**, UAOR reinjects the observation information into the next layer's FFN by blending. This mechanism helps VLA models attend more clearly to the observation during inference, enabling more confident and faithful action generation. Comprehensive simulation and real-world experiments show that our method consistently improves the performance of heterogeneous VLA models across various tasks and embodiments while incurring minimal computational overhead. Notably, **UAOR** eliminates the need for extra observation cues or modules, making it a versatile and practical plug-in for existing VLA pipelines.
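To make the mechanism in the abstract concrete, below is a minimal PyTorch sketch of the uncertainty-gated reinjection idea: compute an entropy over the action-token distribution at a layer and, when it exceeds a threshold, blend observation features into the hidden states fed to the next layer's FFN. All names here (`action_entropy`, `maybe_reinject`, `entropy_threshold`, `blend_alpha`, `obs_features`) are hypothetical illustrations; the paper's exact definition of Action Entropy and its blending rule may differ.

```python
# Hedged sketch of uncertainty-aware observation reinjection.
# Assumes: `action_token_logits` are the current layer's logits over the
# action-token vocabulary, `hidden` are the hidden states entering the next
# FFN, and `obs_features` are observation embeddings projected to the same
# shape as `hidden`. Threshold and blend weight are placeholder values.
import torch
import torch.nn.functional as F


def action_entropy(action_token_logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the action-token distribution, averaged over tokens."""
    probs = F.softmax(action_token_logits, dim=-1)             # (num_tokens, vocab)
    ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # (num_tokens,)
    return ent.mean()


def maybe_reinject(
    hidden: torch.Tensor,
    obs_features: torch.Tensor,
    action_token_logits: torch.Tensor,
    entropy_threshold: float = 2.0,
    blend_alpha: float = 0.1,
) -> torch.Tensor:
    """If the current layer is uncertain about the action tokens, blend the
    observation features back into the hidden states before the next FFN;
    otherwise pass the hidden states through unchanged (training-free gate)."""
    if action_entropy(action_token_logits) > entropy_threshold:
        hidden = (1.0 - blend_alpha) * hidden + blend_alpha * obs_features
    return hidden
```

Because the gate only rescales and adds existing features, it requires no new parameters or fine-tuning, which is consistent with the plug-and-play, training-free claim.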
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 1630