Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems

Published: 22 Oct 2024, Last Modified: 01 Nov 2024
Venue: NeurIPS 2024 Workshop Open-World Agents (Poster)
License: CC BY 4.0
Keywords: Web Agents, Multi-Agent Systems, Multi-Step Planning and Reasoning
TL;DR: We introduce Agent-E, a novel web agent that outperforms existing systems on the WebVoyager benchmark by 10-30% in most categories.
Abstract: AI agents are changing the way work gets done in both consumer and enterprise domains. However, the design patterns and architectures for building highly capable agents and multi-agent systems are still developing, and the implications of various design choices and algorithms are not yet well understood. In this paper, we present our work on building a novel web agent, Agent-E. Agent-E introduces numerous architectural improvements over prior state-of-the-art web agents, such as a hierarchical architecture, a flexible DOM distillation and denoising method, and the concept of "change observation" to guide the agent towards more accurate performance. We first evaluate Agent-E on the WebVoyager benchmark dataset and show that it beats other state-of-the-art (SOTA) text and multi-modal web agents in most categories by 10-30%. We then synthesize our learnings from the development of Agent-E into general design principles for developing agentic systems. These include the use of domain-specific primitive skills, the importance of distillation and denoising of environmental observations, and the advantages of a hierarchical architecture.
Submission Number: 6