Keywords: web agents, code agents, skill induction, web automation
Abstract: Skill induction has been widely studied as a way for web agents to accumulate reusable knowledge across tasks, but almost entirely under action-based frameworks, where skills remain bound to a particular scaffolding. We instead study skill induction for code-native web agents, where skills take the form of standalone Playwright functions rather than compositions over a fixed action vocabulary. On WebArena-Verified, moving to a code-native substrate changes the skill-induction picture in three connected ways. First, a hybrid code agent that combines browser-tool navigation with Playwright script execution sets a substantially stronger no-skill baseline, outperforming a BrowserGym action-based agent with the same backbone model by 10.3 pp on the 104-task subset. Second, on this stronger baseline, naive skill induction can actively degrade performance, and prompt-level self-verification offers only marginal correction; we trace this to confirmation bias and address it with a multi-agent pipeline that decouples solving, verification, and updating into independent stages, admitting skills only through a gated mechanism. The full pipeline reaches 67.2%, a 6.4 pp improvement over the no-skill code agent at fixed skill format, and 10.3 pp above BrowserGym + ASI on the same 104-task subset. Third, because the induced skills are standard Python functions, portability follows naturally from the representation: a preliminary case study shows that the same skill files drop into a different agent framework via directory copy, with no adaptation step.
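To make the "standalone Playwright functions" framing concrete, here is a minimal sketch of what such a skill file might contain. The function name, URL path, and CSS selector are illustrative assumptions, not drawn from the paper; the point is only that the skill is an ordinary Python function over a Playwright `Page`, so any framework that holds a page handle can import and call it.

```python
# Hypothetical skill file: a self-contained function over a Playwright Page.
# All identifiers and selectors below are illustrative assumptions.

def find_order_total(page, order_id: str) -> str:
    """Open an order's detail page and return its displayed total.

    `page` is expected to be a Playwright Page (or any object with the
    same goto/locator interface). Because the skill is plain Python,
    portability reduces to copying this file into another agent's
    skill directory.
    """
    page.goto(f"/orders/{order_id}")
    return page.locator(".order-total").inner_text()
```

Because the skill depends only on the `Page` interface, it can be exercised in isolation with a lightweight stub standing in for a real browser page, which is also how a gated admission stage could sanity-check a candidate skill without a live environment.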
Presentation Mode: Yes, at least one author will attend and present in person.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 100