AGENTCON: PRACTICAL ATTACKS ON GENERALIST WEB AGENTS VIA IMPERCEPTIBLE MANIPULATION

ICLR 2026 Conference Submission13176 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Privacy, Safety, VLM web agents, Adversarial examples, Web agents, Vision-Language Model

Abstract: Recent progress in generalist web agents built on large multimodal models has enabled automation of complex web tasks but also created new security risks. We identify a new attack vector against web agents that does not require manipulating HTML elements, unlike prior work. Our threat model focuses on marketplace websites, a primary target of generalist web agents, where users and sellers can upload images themselves. We propose AGENTCON, a practical attack that crafts adversarial perturbations on listing images, rather than perturbing the entire input as in traditional adversarial attacks, to induce the intended target action by web agents. AGENTCON incorporates real-world constraints from webpage rendering into the optimization so that the attack remains effective when neighboring listings and the attack image’s position vary. Our evaluation on 1,680 tasks against a state-of-the-art web agent framework demonstrates the effectiveness of AGENTCON, with an attack success rate (ASR) of 80.4% on average across four application scenarios and three agent models. AGENTCON is also resilient to common countermeasures, achieving an ASR of 76% on average.
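The core idea in the abstract, a bounded perturbation confined to a single listing image and optimized to stay effective across placements, can be illustrated with a toy sketch. This is not the authors' method: the linear surrogate score, the per-slot weights, the function names, and the `eps`/`step`/`iters` values are all assumptions made for illustration, and variation in neighboring listings is omitted because it drops out of the gradient of a linear surrogate.

```python
import random


def expected_score(patch, slot_weights):
    """Mean surrogate score of the patch over every slot it could occupy."""
    total = 0.0
    for weights in slot_weights:
        total += sum(w * p for w, p in zip(weights, patch))
    return total / len(slot_weights)


def craft_perturbation(image, slot_weights, eps=8 / 255, step=1 / 255, iters=40):
    """Signed-gradient ascent on one listing image under an L-infinity budget.

    `slot_weights[k][i]` is a hypothetical stand-in for how much pixel i
    contributes to the target action when the image is rendered in slot k.
    """
    n_slots = len(slot_weights)
    # For a linear score, the gradient of the mean over slots is the mean weight.
    grad = [sum(ws[i] for ws in slot_weights) / n_slots for i in range(len(image))]
    adv = list(image)
    for _ in range(iters):
        for i, g in enumerate(grad):
            direction = (g > 0) - (g < 0)
            moved = adv[i] + step * direction
            # Project into the eps-ball around the clean pixel, then into [0, 1].
            moved = max(image[i] - eps, min(image[i] + eps, moved))
            adv[i] = max(0.0, min(1.0, moved))
    return adv


# Toy usage: a 16-pixel "listing image" and 4 possible slots (all synthetic).
random.seed(0)
image = [random.random() for _ in range(16)]
slots = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
adv = craft_perturbation(image, slots)
```

Only the attacker's own image is modified, every pixel stays within `eps` of its clean value, and the objective is averaged over slots rather than tuned to one position. A real attack on a multimodal agent would replace the linear surrogate with gradients from the agent model itself.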

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Submission Number: 13176
