Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems

Tamer Abuelsaad; Deepak Akkil; Prasenjit Dey; Ashish Jagmohan; Aditya Vempaty; Ravi Kokku

Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems

Tamer Abuelsaad, Deepak Akkil, Prasenjit Dey, Ashish Jagmohan, Aditya Vempaty, Ravi Kokku

Published: 22 Oct 2024, Last Modified: 01 Nov 2024NeurIPS 2024 Workshop Open-World Agents PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Web Agents, Multi-Agent Systems, Multi-Step Planning and Reasoning

TL;DR: We introduce Agent-E, a novel web agent that outperforms existing systems in WebVoyager by 10-30%.

Abstract: AI Agents are changing the way work gets done, both in consumer and enterprise domains. However, the design patterns and architectures to build highly capable agents or multi-agent systems are still developing, and the understanding of the implication of various design choices and algorithms is still evolving. In this paper, we present our work on building a novel web agent, Agent-E. Agent-E introduces numerous architectural improvements over prior state-of-the-art web agents such as hierarchical architecture, flexible DOM distillation and denoising method, and the concept of \textit{change observation} to guide the agent towards more accurate performance. We first present the results of an evaluation of Agent-E on WebVoyager benchmark dataset and show that Agent-E beats other SOTA text and multi-modal web agents on this benchmark in most categories by 10-30\%. We then synthesize our learnings from the development of Agent-E into general design principles for developing agentic systems. These include the use of domain-specific primitive skills, the importance of distillation and de-noising of environmental observations, and the advantages of a hierarchical architecture.

Submission Number: 6

Loading