ColorBrowserAgent: Complex Long-Horizon Browser Agent with Adaptive Knowledge Evolution

Published: 18 Apr 2026, Last Modified: 23 Apr 2026
Venue: ACL 2026 Industry Track Oral
License: CC BY 4.0
Keywords: GUI Agent, Human in the loop, Web Automation
TL;DR: ColorBrowserAgent bridges site heterogeneity and long-horizon instability through adaptive human-in-the-loop knowledge evolution and compressed memory, delivering SOTA performance in complex web tasks.
Abstract: Advances in vision-language models have driven significant progress in web automation. However, deploying autonomous agents in real-world settings remains challenging for two main reasons: site heterogeneity, where generalist models lack domain-specific priors for diverse interfaces, and long-horizon instability, where decision drift accumulates over extended interactions. To address these challenges, we introduce ColorBrowserAgent (Complex Long-Horizon Browser Agent), a knowledge-evolving agent for robust web automation. It combines two synergistic mechanisms: human-in-the-loop knowledge adaptation, which transforms sparse human feedback into reusable domain knowledge, and knowledge-aligned progressive summarization, which stabilizes long interactions through memory compression. Extensive experiments on WebArena, WebChoreArena, and an industrial deployment show that ColorBrowserAgent consistently outperforms strong baselines. It achieves a state-of-the-art success rate of 71.2% on WebArena and maintains 47.4% on WebChoreArena in a zero-shot transfer setting. In commercial deployment, it yields a 19.3% relative improvement in user satisfaction, supporting its robustness in real-world scenarios.
Submission Type: Deployed
Copyright Form: pdf
Submission Number: 142