EconAgentBench: Economic Benchmarks for LLM Agents in Unknown Environments

ICLR 2026 Conference Submission 18807 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: large language models, benchmarks, economics, pricing, stable matching, llm agents, llm agent benchmarks
TL;DR: We develop benchmarks for LLM agents that act in economic environments, to more richly understand LLM agent capabilities.
Abstract: We develop benchmarks for LLM agents that act in, learn from, and strategize in unknown economic environments, whose specifications the agent must learn over time through deliberate exploration. Our benchmarks consist of decision-making tasks derived from key problems in economics. To forestall saturation, the benchmark tasks are synthetically generated with scalable difficulty levels. Overall, our benchmarks assess the ability of LLM agents to tackle complex economic problems in procurement, scheduling, and pricing—applications that should grow in importance as such agents are further integrated into the economy.
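To illustrate the idea of synthetically generated tasks with scalable difficulty, here is a minimal, hypothetical sketch of a pricing-task generator. The names (`PricingTask`, `generate_pricing_task`, `difficulty`) and the specific difficulty knobs (horizon length, demand noise, number of demand segments) are illustrative assumptions, not the benchmark's actual construction.

```python
# Hypothetical sketch: generate a pricing task whose demand environment is
# hidden from the agent, with a single integer difficulty knob.
import random
from dataclasses import dataclass


@dataclass
class PricingTask:
    """An unknown demand environment the agent must learn by posting prices."""
    horizon: int          # number of pricing rounds the agent gets
    demand_params: dict   # hidden parameters of the demand curve
    noise_std: float      # observation noise on realized demand


def generate_pricing_task(difficulty: int, seed: int = 0) -> PricingTask:
    """Higher difficulty -> shorter horizon, noisier demand, more segments."""
    rng = random.Random(seed)
    n_segments = 1 + difficulty  # piecewise-linear demand segments
    demand_params = {
        f"segment_{i}": {
            "intercept": rng.uniform(50, 150),
            "slope": -rng.uniform(0.5, 3.0),
        }
        for i in range(n_segments)
    }
    return PricingTask(
        horizon=max(10, 100 - 10 * difficulty),
        demand_params=demand_params,
        noise_std=0.5 * difficulty,
    )


def realized_demand(task: PricingTask, price: float, rng: random.Random) -> float:
    """Simulate noisy demand at a posted price; the agent sees only this signal."""
    base = sum(
        max(0.0, seg["intercept"] + seg["slope"] * price)
        for seg in task.demand_params.values()
    )
    return max(0.0, base + rng.gauss(0.0, task.noise_std))


if __name__ == "__main__":
    task = generate_pricing_task(difficulty=3, seed=42)
    rng = random.Random(1)
    # An agent would interact by posting prices over task.horizon rounds,
    # observing noisy demand and earning revenue price * demand each round.
    print(realized_demand(task, price=20.0, rng=rng))
```

Because the environment parameters are hidden and freshly sampled per task, an agent cannot memorize answers; raising the difficulty parameter yields harder instances, which is one way such a benchmark could resist saturation.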
Primary Area: datasets and benchmarks
Submission Number: 18807