Keywords: Large vision-language model, web agent, risk analysis
Abstract: Large vision–language model (LVLM)-based web agents are emerging as powerful automation tools but face severe security risks in real-world deployment. Existing benchmarks offer limited coverage, typically isolating user-level prompts from environmental threats and thus failing to capture the full spectrum of vulnerabilities. To address this, we present SecureWebArena, the first holistic security benchmark for web agents. SecureWebArena features a unified suite of six realistic web environments with 2,970 adversarial trajectories, covering a structured taxonomy of six attack vectors that span both user-level and environment-level manipulations. Crucially, we introduce a multi-layered evaluation protocol that dissects agent failures across internal reasoning, behavioral execution, and task outcomes, enabling fine-grained risk analysis beyond simple success metrics. Experiments on nine representative LVLMs reveal universal vulnerabilities to subtle manipulations and uncover significant trade-offs between model specialization and security. SecureWebArena establishes a rigorous foundation for advancing the development of trustworthy web agents.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Language Modeling, Dialogue and Interactive Systems
Contribution Types: Data analysis
Languages Studied: English
Submission Number: 1116