Graph-of-Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

Dawei Liu; Zongxia Li; Hongyang Du; Xiyang Wu; Shihang Gui; Yongbei Kuang; Lichao Sun

Published: 15 May 2026, Last Modified: 25 May 2026 | Venue: AgentSkills 2026 Poster | License: CC BY 4.0

Keywords: Computer Use Agents, Agent Skills, Graph of Skills, Skill Retrieval

TL;DR: Prebuild a skill graph for more effective and efficient skill retrieval and usage

Abstract: Skill usage has become a core component of modern agent systems and can substantially improve agents' ability to complete complex tasks. In real-world settings, where agents must monitor and interact with numerous personal applications, web browsers, and other environment interfaces, skill libraries can scale to thousands of reusable skills. Scaling to larger skill sets introduces two key challenges. First, loading the full skill set saturates the context window, driving up token costs, hallucination, and latency. In this paper, we present Graph of Skills (GoS), an inference-time structural retrieval layer for large skill libraries. GoS constructs an executable skill graph offline from skill packages, then at inference time retrieves a bounded, dependency-aware skill bundle through hybrid semantic-lexical seeding, reverse-weighted Personalized PageRank, and context-budgeted hydration. On SkillsBench and ALFWorld, GoS consistently delivers substantial reward improvements and token savings across three model families (Claude Sonnet, GPT-5.2 Codex, and MiniMax). Notably, on SkillsBench with GPT-5.2 Codex, it achieves a peak reward increase of 25.55% while reducing input tokens by 56.72% relative to the vanilla full skill-loading baseline. Additional ablation studies across skill libraries ranging from 200 to 2,000 skills further demonstrate that GoS consistently outperforms both vanilla full skill loading and simple vector retrieval in balancing reward, token efficiency, and runtime.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 14
