JARVIS or Ultron? A Survey on the Safety and Security Threats of Computer-Using Agents

ACL ARR 2025 July Submission1076 Authors

29 Jul 2025 (modified: 24 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: Recently, AI-driven interactions with computing devices have advanced from basic prototype tools to sophisticated, LLM-based systems that emulate human-like operations in graphical user interfaces. We are now witnessing the emergence of Computer-Using Agents (CUAs), capable of autonomously performing tasks such as navigating desktop applications, web pages, and mobile apps. However, as these agents grow in capability, they also introduce novel safety and security risks. Vulnerabilities in LLM-driven reasoning, combined with the added complexity of integrating multiple software components and multimodal inputs, further complicate the security landscape. In this paper, we present a systematization of knowledge on the safety and security threats of CUAs. We conduct a comprehensive literature review and distill our findings along four research objectives: (i) define the CUA in a way that suits safety analysis; (ii) categorize current safety threats among CUAs; (iii) propose a comprehensive taxonomy of existing defensive strategies; and (iv) summarize prevailing benchmarks, datasets, and evaluation metrics used to assess the safety and performance of CUAs. Building on these insights, our work provides future researchers with a structured foundation for investigating unexplored vulnerabilities and offers practitioners actionable guidance in designing and deploying secure Computer-Using Agents.

Paper Type: Long

Research Area: Ethics, Bias, and Fairness

Research Area Keywords: ethical considerations in NLP applications

Contribution Types: Surveys

Languages Studied: English

Reassignment Request Area Chair: This is not a resubmission

Reassignment Request Reviewers: This is not a resubmission

A1 Limitations Section: This paper has a limitations section.

A2 Potential Risks: Yes

A2 Elaboration: After the limitations section

B Use Or Create Scientific Artifacts: Yes

B1 Cite Creators Of Artifacts: Yes

B1 Elaboration: Sections 3, 4, and 5, and the reference section.

B2 Discuss The License For Artifacts: N/A

B2 Elaboration: Research use

B3 Artifact Use Consistent With Intended Use: N/A

B3 Elaboration: Research use

B4 Data Contains Personally Identifying Info Or Offensive Content: N/A

B4 Elaboration: No Personally Identifying Info Or Offensive Content

B5 Documentation Of Artifacts: N/A

B5 Elaboration: No new data

B6 Statistics For Data: N/A

B6 Elaboration: No new data

C Computational Experiments: No

C1 Model Size And Budget: N/A

C2 Experimental Setup And Hyperparameters: N/A

C3 Descriptive Statistics: N/A

C4 Parameters For Packages: N/A

D Human Subjects Including Annotators: No

D1 Instructions Given To Participants: N/A

D2 Recruitment And Payment: N/A

D3 Data Consent: N/A

D4 Ethics Review Board Approval: N/A

D5 Characteristics Of Annotators: N/A

E AI Assistants In Research Or Writing: No

E1 Information About Use Of AI Assistants: N/A

Author Submission Checklist: yes

Submission Number: 1076
