Rethinking Defense for Computer-Use Agents: Context Deception Attacks are Simple to Defend

Pei Yang; Hai Ci; Mike Zheng Shou

Rethinking Defense for Computer-Use Agents: Context Deception Attacks are Simple to Defend

Pei Yang, Hai Ci, Mike Zheng Shou

13 Sept 2025 (modified: 11 Feb 2026)Submitted to ICLR 2026EveryoneRevisionsBibTeXCC BY 4.0

Keywords: Computer-use agents

Abstract: Computer-use agents powered by vision-language models (VLMs) have significantly advanced human-computer interaction. However, they remain vulnerable to \textit{context deception attacks}, an emerging threat where adversaries embed misleading content into the agent's operational environment (such as a malicious pop-up window) to hijack agent behavior. As recent benchmarks highlight the severity of these attacks, initial studies have shown that conventional defenses like direct prompting are largely ineffective, fostering a perception that these attacks are a difficult and unsolved challenge. In this paper, we challenge this conclusion, arguing that the perceived difficulty is an artifact of the defense paradigms studied, not an inherent property of the attacks themselves. We introduce in-context defense, a surprisingly simple paradigm that leverages in-context learning to prove our claim. By augmenting the agent’s context with a minimal set of exemplars, we guide it to perform explicit defensive reasoning before action planning, effectively immunizing it from deception. Experiments show this method is remarkably effective, reducing up to 91.2\% of pop-up window attacks and achieving near-perfect defense on some other attacks, a stark contrast to the failures of prior approaches. Our work delivers two critical insights: (1) the problem of context deception is far more tractable than previously believed, and (2) teaching an agent a reasoning process (defense-first analysis), rather than just giving it a rule, is the key to robust defense.

Supplementary Material: zip

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 4688

Loading