AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

Yunhao Feng, Yifan Ding, Yingshui Tan, Xingjun Ma, Yige Li, Yutao Wu, Yifeng Gao, Kun Zhai, Yanming Guo

Published: 2026, Last Modified: 17 May 2026, CoRR 2026, CC BY-SA 4.0