ATLAS: Actor-Critic Task-completion with Look-ahead Action Simulation

Published: 23 Sept 2025, Last Modified: 22 Nov 2025, LAW, CC BY 4.0
Keywords: Web Agent, LLM, Planning, Cognitive Map
TL;DR: We introduce ATLAS, a web task-completion agent that plans before acting by simulating action outcomes with a specialized memory called a "cognitive map". We achieve SOTA results on WebArena-Lite and substantially close the gap to human performance.
Abstract: We introduce ATLAS (Actor-Critic Task-completion with Look-ahead Action Simulation), a web agent that combines a hierarchical planner with an internal world-model to simulate action outcomes and choose the best course of action before execution. ATLAS starts by building a "cognitive map" - a simple world-model of the environment - through lightweight, curiosity-driven exploration of the environment. The planner proposes candidate actions; a simulator predicts their consequences in natural language; a critic analyzes the options to select the best roll-out; and a browser executor performs the chosen action. On the WebArena benchmarks, ATLAS attains state-of-the-art success, exceeding previously reported systems by a large margin. On the WebArena-Lite benchmark, ATLAS achieves a ∼63% success score, compared to 54% for the previously published state of the art. Unlike previous systems, our modular architecture requires no website-specific model finetuning. Ablations show sizable drops without the world-model, hierarchical planner, and lookahead-based replanner, confirming their complementary roles.
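The loop described in the abstract (explore, propose, simulate, critique, execute) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes states and actions are plain strings, replaces the natural-language simulator with a lookup table, replaces curiosity-driven exploration with an exhaustive traversal, and uses a hand-written score table in place of an LLM critic. All names (`CognitiveMap`, `explore`, `select_action`, `run_episode`, `ENV`, `SCORES`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = str   # assumption: a page is identified by a short string
Action = str  # assumption: an action is identified by a short string
Env = Dict[State, Dict[Action, State]]


@dataclass
class CognitiveMap:
    """Toy world-model: remembers (state, action) -> observed next state."""
    transitions: Dict[Tuple[State, Action], State] = field(default_factory=dict)

    def record(self, state: State, action: Action, next_state: State) -> None:
        self.transitions[(state, action)] = next_state

    def simulate(self, state: State, action: Action) -> State:
        # Transitions never seen during exploration are reported as "unknown".
        return self.transitions.get((state, action), "unknown")


def explore(env: Env, start: State, cmap: CognitiveMap) -> None:
    """Stand-in for exploration: visit every reachable state once."""
    frontier, seen = [start], {start}
    while frontier:
        state = frontier.pop()
        for action, next_state in env.get(state, {}).items():
            cmap.record(state, action, next_state)
            if next_state not in seen:
                seen.add(next_state)
                frontier.append(next_state)


def select_action(state: State, candidates: List[Action], cmap: CognitiveMap,
                  critic: Callable[[State], float]) -> Action:
    """Simulate each candidate and keep the one the critic scores highest."""
    return max(candidates, key=lambda a: critic(cmap.simulate(state, a)))


def run_episode(env: Env, start: State, goal: State,
                critic: Callable[[State], float],
                max_steps: int = 10) -> Tuple[List[Action], State]:
    cmap = CognitiveMap()
    explore(env, start, cmap)
    state, trace = start, []
    for _ in range(max_steps):
        if state == goal:
            break
        candidates = list(env.get(state, {}))  # planner stand-in: all actions
        if not candidates:
            break
        action = select_action(state, candidates, cmap, critic)
        state = env[state][action]  # executor stand-in: apply the action
        trace.append(action)
    return trace, state


# Hypothetical three-page site and hand-written critic scores.
ENV: Env = {
    "home": {"open_orders": "orders", "open_cart": "cart"},
    "orders": {"open_latest": "order_detail", "go_home": "home"},
    "cart": {"go_home": "home"},
}
SCORES = {"order_detail": 1.0, "orders": 0.5}


def toy_critic(predicted: State) -> float:
    return SCORES.get(predicted, 0.0)


if __name__ == "__main__":
    print(run_episode(ENV, "home", "order_detail", toy_critic))
```

In this sketch the agent never acts on a candidate without first checking its predicted outcome, which is the property the abstract attributes to look-ahead simulation. How ATLAS actually represents the cognitive map and scores roll-outs is not specified in the abstract.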
Submission Type: Research Paper (4-9 Pages)
Submission Number: 28