Keywords: Mobile GUI Agents, UI Security, Adversarial Attacks, AgentHazard, Empirical Evaluation
TL;DR: This study presents the first systematic investigation of mobile GUI agents' vulnerabilities to on-screen content manipulated by untrustworthy third parties.
Abstract: GUI agents are designed to autonomously execute diverse device-control tasks by interpreting and interacting with device screens.
Despite notable advancements, their resilience in real-world scenarios—where screen content may be
partially manipulated by untrustworthy third parties—remains largely unexplored.
In this work, we present the first systematic investigation into the vulnerabilities of mobile GUI agents.
We introduce a scalable attack simulation framework named AgentHazard,
which enables flexible and targeted modifications of screen content within existing applications.
Leveraging this framework, we develop a comprehensive benchmark suite comprising both a dynamic task execution environment
and a static dataset of state-rule pairs.
The dynamic environment encompasses 122 reproducible tasks in an emulator with various types of hazardous UI content,
while the static dataset consists of over 3,000 attack scenarios constructed from screenshots collected from a wide range of commercial apps.
Importantly, our content modifications are designed to be feasible for unprivileged third parties.
We perform experiments on 6 widely-used mobile GUI agents and 5 common backbone models using our benchmark.
Our findings reveal that all examined agents are significantly influenced by misleading third-party content
(with average misleading rates of 42.1% and 40.7% in the dynamic and static environments, respectively).
We also find that these vulnerabilities are closely linked to the agents' perception modalities and backbone LLMs.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 5419