Keywords: Web Agents; World Model; Planning; MPC
TL;DR: This paper presents a preliminary study on leveraging the internal world model of complex environments within LLMs to help web agents strike a balance between accuracy and efficiency.
Abstract: Language agents have shown promising performance in automating web-based tasks, but the complexity and vast search spaces of real-world websites challenge reactive agents in identifying optimal solutions. While tree search agents offer enhanced exploration by interacting with actual websites, they often incur high costs, potential risks, and are challenging to implement for real-world websites. This paper explores a novel paradigm leveraging large language models' (LLMs) internal world models for planning in complex environments, presenting a middle ground between reactive agents and tree search agents. Results on two representative benchmarks, VisualWebArena and Mind2Web-live, demonstrate that our approach largely closes the gap between reactive agents and tree search agents, while maintaining efficiency and safety advantages. Notably, tree search can be considered as approaching an upper bound for our method, as it explores actual websites rather than simulations. This work opens new avenues for research into more effective and secure strategies for autonomous agents in complex, dynamic environments. It represents a step forward in improving upon reactive agents while approaching the performance of tree search methods, without incurring their implementation challenges and costs.
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13327
Loading