SkillCompiler: A Unified Compilation Framework for Cross-Platform LLM Agent Skills

Yipeng Ouyang; Yi Xiao; Yuhao Gu; Xianwei Zhang

SkillCompiler: A Unified Compilation Framework for Cross-Platform LLM Agent Skills

Yipeng Ouyang, Yi Xiao, Yuhao Gu, Xianwei Zhang

Published: 15 May 2026, Last Modified: 22 May 2026AgentSkills 2026 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: LLM Agents, Skill Compilation, Prompt Engineering, Intermediate Representation, Security Hardening

TL;DR: SkillCompiler uses a shared intermediate representation to automate agent skill adaptation across platforms, enhancing security and performance while reducing manual rewriting and token costs.

Abstract: LLM-Agents have evolved into autonomous systems for complex task execution, with the SKILL.md specification and progressive disclosure emerging as the de facto standard for encapsulating agent capabilities. However, a critical bottleneck remains: different agent frameworks exhibit starkly different sensitivities to prompt formatting, causing up to 40% performance variation. Currently, nearly all skills exist as a single, format-agnostic Markdown version. Manual per-platform rewriting creates an unsustainable $O(m \times n)$ maintenance burden, while an audit of 3,984 community skills reveals that 37% contain security vulnerabilities. To address this, we present SkillCompiler, a compilation framework that introduces classical compiler design into agent skill development. Through a four-phase pipeline of frontend format validation, intermediate representation construction, automatic safety constraint enforcement (Anti-Skill Injection), and polymorphic backend emission, SkillCompiler reduces adaptation complexity from $O(m \times n)$ to $O(m+n)$. Experiments on SkillsBench demonstrate that compiled skills consistently outperform their original counterparts, improving pass rates from 21.1% to 33.3% on Claude Code and from 35.1% to 48.7% on Kimi CLI. Engineering metrics validate the system's pragmatism: achieving sub-10ms compilation latency per skill, a 94.8% proactive security trigger rate, and a 10-46% improvement in runtime token efficiency across platforms.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 65

Loading