Keywords: Large Language Models, AI Agents, MCTS
Abstract: AI agents leveraging the capabilities of Large Language Models (LLMs) and Reinforcement Learning (RL) techniques have garnered growing attention due to their commendable performance in autonomously executing real-world tasks. Effective exploration of the action space is paramount for the successful accomplishment of diverse tasks by these AI agents. In this paper, we propose an enhanced approach for $\textbf{R}$apid $\textbf{E}$xploration and e$\textbf{X}$ploitation of action space for LLM-based AI agents, called $\textbf{REX}$. Existing LLM-driven agents have inherent limitations, such as a heavy reliance on precise descriptions for decision-making and the lack of a systematic approach to leveraging trial-and-error procedures akin to traditional RL. To overcome these challenges,
REX introduces an additional layer of rewards and integrates concepts similar to Upper Confidence Bound (UCB) scores, leading to more robust and efficient AI agent performance. The agent's decision-making process, which involves predicting the next best action, is guided by these UCB scores. This approach enables the use of offline behaviors from logs and allows seamless integration with existing foundation models, since it does not require any model fine-tuning. Through comparative analysis with existing methods such as Chain-of-Thought (CoT) and Reflexion, REX-based methods demonstrate comparable performance and, in certain cases, even surpass the results achieved by these existing techniques. Notably, REX-based methods exhibit remarkable reductions in execution time while systematically exploring the action space of AI agents, enhancing their practical applicability across a diverse set of scenarios.
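For readers unfamiliar with UCB-style action selection, the following minimal Python sketch illustrates the general idea of scoring candidate actions with a UCB1-like rule; it is a generic illustration under assumed names (`ucb_score`, `select_action`, and the example actions are hypothetical), not the paper's exact formulation of REX.

```python
import math

def ucb_score(avg_reward, total_visits, action_visits, c=1.414):
    """UCB1-style score: exploitation term plus an exploration bonus."""
    if action_visits == 0:
        return float("inf")  # unvisited actions are always tried first
    return avg_reward + c * math.sqrt(math.log(total_visits) / action_visits)

def select_action(stats):
    """Pick the candidate action with the highest UCB score.

    `stats` maps each candidate action (e.g., a step proposed by an LLM)
    to a tuple (average reward, visit count).
    """
    total = sum(visits for _, visits in stats.values()) or 1
    return max(stats, key=lambda a: ucb_score(stats[a][0], total, stats[a][1]))

# Hypothetical usage with three candidate actions proposed by an LLM planner.
stats = {
    "search_docs": (0.6, 5),
    "run_tests": (0.4, 2),
    "ask_clarification": (0.0, 0),  # unvisited, so selected first
}
print(select_action(stats))  # -> "ask_clarification"
```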
Submission Number: 20