CUA-Skill: Developing Computer Using Agents with a Skill Framework

Published: 27 May 2026, Last Modified: 09 Jun 2026
Venue: CompLearn 2026 Poster
License: CC BY 4.0
Keywords: Computer Using Agent, Compositional Graph, Skill, Agent Infrastructure
Abstract: Computer-Using Agents (CUAs) aim to autonomously operate computer systems to complete real-world tasks. However, existing agentic systems remain difficult to scale and lag behind human performance. A key limitation is the absence of reusable, structured skill abstractions that capture how humans interact with graphical user interfaces and how those skills can be leveraged. We introduce CUA-Skill, a computer-using agentic skill base that encodes human computer-use knowledge as skills coupled with parameterized execution and composition graphs. CUA-Skill is a large-scale library of carefully engineered skills spanning common Windows applications, serving as a practical infrastructure and tool substrate for scalable, reliable agent development. Built upon this skill base, we construct CUA-Skill Agent, an end-to-end computer-using agent that supports dynamic skill retrieval, argument instantiation, and memory-aware failure recovery. Empirically, CUA-Skill substantially improves the quality and reliability of trajectory generation, achieving a 76.4% success rate, several times higher than existing baselines. On the challenging end-to-end WindowsAgentArena benchmark, CUA-Skill Agent further attains state-of-the-art performance with a 57.5% best-of-three success rate, while remaining significantly more efficient than prior and concurrent approaches. Together, CUA-Skill and CUA-Skill Agent serve as a strong and scalable foundation for building future CUA systems.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 96