Abstract: Agent self-improvement, where agents autonomously train their underlying Large Language Model (LLM) on self-sampled trajectories, shows promising results but often stagnates in web environments due to limited exploration and under-utilization of pretrained web knowledge.
To overcome this stagnation, we propose a novel framework that introduces a co-evolving World Model LLM.
This world model predicts the next observation based on the current observation and action within the web environment.
The World Model serves dual roles:
(1) as a virtual web server generating self-instructed training data to continuously refine the agent's policy,
and (2) as an imagination engine during inference, enabling look-ahead simulation to guide action selection for the agent LLM.
Experiments in real-world web environments (Mind2Web-Live, WebVoyager, and GAIA-web) show a 10% performance gain over existing self-evolving agents, demonstrating the efficacy and generalizability of our approach without any distillation from more powerful closed-source models.
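The look-ahead simulation described above can be sketched as follows. This is a minimal, hypothetical illustration of world-model-guided action selection; the names (`select_action`, `score`, the toy world model) and the greedy one-step search are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch: use a world model to "imagine" the next observation
# for each candidate action, then pick the action with the best imagined outcome.
# All function names and the toy model below are illustrative only.

def select_action(world_model, observation, candidate_actions, score):
    """Pick the candidate action whose imagined next observation scores best."""
    best_action, best_score = None, float("-inf")
    for action in candidate_actions:
        # The world model predicts the next observation from (observation, action).
        imagined = world_model(observation, action)
        s = score(imagined)
        if s > best_score:
            best_action, best_score = action, s
    return best_action

# Toy usage: the "world model" appends the action to the observation history,
# and the score prefers trajectories that reach the token "checkout".
toy_world_model = lambda obs, act: obs + [act]
toy_score = lambda obs: 1.0 if "checkout" in obs else 0.0

chosen = select_action(toy_world_model, ["home"], ["search", "checkout"], toy_score)
# chosen == "checkout"
```

In the paper's setting the world model would be an LLM predicting the next web page state, and the score could come from the agent's own value estimate; this one-step greedy loop is only the simplest form of look-ahead.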
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: LLM/AI agents, world model, self improvement
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Keywords: agent, world model, self improvement, large language model
Submission Number: 726