Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis

Published: 05 Mar 2025, Last Modified: 15 Apr 2025 · BuildingTrust · CC BY 4.0
Track: Long Paper Track (up to 9 pages)
Keywords: AI Agent, Web AI Agent, LLM, Jailbreaking
TL;DR: We identify why Web AI agents are more vulnerable than LLMs by dissecting their design, revealing key components that increase risk and offering insights to build safer agents.
Abstract: Recent research has significantly advanced Web AI agents, introducing novel architectures and benchmarks that demonstrate major progress in autonomous web interaction and navigation. However, recent studies have also shown that many AI agents will execute malicious tasks and are more vulnerable than standalone LLMs. Our work studies why Web AI agents, built on safety-aligned backbone Large Language Models (LLMs), remain highly susceptible to following malicious user inputs. In particular, we investigate the sources of these vulnerabilities by analyzing the differences between Web AI agents and standalone LLMs in terms of their design and components, quantifying the vulnerability rate introduced by each component. Through a fine-grained evaluation designed to uncover nuanced jailbreaking signals, we identify three key factors in Web AI agents that make them more vulnerable than standalone LLMs: 1) directly embedding user input in the system prompt of the LLM, 2) generating actions in a multi-step manner, and 3) processing Event Streams (observation + action history) from web navigation. Furthermore, we observe that many current benchmarks and evaluations rely on mock-up websites, which can yield misleading results. Our findings highlight the need to prioritize security and robustness when designing the individual components of AI agents, and we suggest developing more realistic safety evaluation systems for Web AI agents.
Submission Number: 98
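
To make the first two factors concrete, below is a minimal, hypothetical sketch (not the paper's code) contrasting how a standalone LLM receives a request with how a typical Web AI agent constructs its prompt. The function names, message fields, and event-stream strings are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical illustration of the prompting patterns the abstract describes.
# All names here are assumptions for the sketch, not the paper's implementation.

def standalone_llm_messages(user_request: str) -> list[dict]:
    """Standalone LLM: the request stays in the 'user' role, so the
    backbone model's safety alignment treats it as untrusted input."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_request},
    ]

def web_agent_messages(user_request: str, event_stream: list[str]) -> list[dict]:
    """Typical Web AI agent: the user request is pasted into the system
    prompt as the agent's goal (factor 1), and the prior observations and
    actions are replayed each turn (factors 2 and 3)."""
    system_prompt = (
        "You are a web navigation agent.\n"
        f"GOAL: {user_request}\n"  # user input embedded in the system prompt
        "At each step, output the next browser action."
    )
    messages = [{"role": "system", "content": system_prompt}]
    for event in event_stream:  # multi-step history fed back every turn
        messages.append({"role": "user", "content": event})
    return messages

if __name__ == "__main__":
    request = "Book the cheapest flight to Paris."
    print(standalone_llm_messages(request))
    print(web_agent_messages(request, [
        "OBSERVATION: <html>...</html>",
        "ACTION: click('search')",
    ]))
```

In the agent variant, the model sees the (possibly malicious) request as part of its own instructions rather than as third-party input, which is one plausible reading of why the paper finds refusal behavior weaker in that configuration.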