Keywords: web agents, code agents, skill induction, web automation
Abstract: Skill induction has been widely studied as a way for web agents to accumulate reusable knowledge across tasks, but almost entirely under action-based frameworks, where skills remain bound to a particular scaffolding. We instead study skill induction for code-native web agents, where skills take the form of standalone Playwright functions rather than compositions over a fixed action vocabulary. On WebArena-Verified, moving to a code-native substrate changes the skill-induction picture in three connected ways. First, a hybrid code agent that combines browser-tool navigation with Playwright script execution sets a substantially stronger no-skill baseline, outperforming a BrowserGym action-based agent with the same backbone model by 10.3 pp on the 104-task subset. Second, on this stronger baseline, naive skill induction can actively degrade performance, and prompt-level self-verification offers only marginal correction; we trace this to confirmation bias and address it with a multi-agent pipeline that decouples solving, verification, and updating into independent stages, admitting skills only through a gated mechanism. The full pipeline reaches 67.2%, a 6.4 pp improvement over the no-skill code agent at fixed skill format, and 10.3 pp above BrowserGym + ASI on the same 104-task subset. Third, because the induced skills are standard Python functions, portability follows naturally from the representation: a preliminary case study shows that the same skill files drop into a different agent framework via directory copy, with no adaptation step.
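To make the "standalone Playwright functions" framing concrete, here is a minimal sketch of what such a skill file might contain. The function name, URL path, and CSS selector are illustrative assumptions, not drawn from the paper; the point is only that the skill is an ordinary Python function over a Playwright `Page`, so any framework that holds a page handle can import and call it.

```python
# Hypothetical skill file: a self-contained function over a Playwright Page.
# All identifiers and selectors below are illustrative assumptions.

def find_order_total(page, order_id: str) -> str:
    """Open an order's detail page and return its displayed total.

    `page` is expected to be a Playwright Page (or any object with the
    same goto/locator interface). Because the skill is plain Python,
    portability reduces to copying this file into another agent's
    skill directory.
    """
    page.goto(f"/orders/{order_id}")
    return page.locator(".order-total").inner_text()
```

Because the skill depends only on the `Page` interface, it can be exercised in isolation with a lightweight stub standing in for a real browser page, which is also how a gated admission stage could sanity-check a candidate skill without a live environment.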
Presentation Mode: Yes, at least one author will attend and present in person.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 100