EconAgentBench: Economic Benchmarks for LLM Agents in Unknown Environments

ICLR 2026 Conference Submission 18807 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: large language models, benchmarks, economics, pricing, stable matching, llm agents, llm agent benchmarks
TL;DR: We develop benchmarks for LLM agents that act in economic environments, to more richly understand LLM agent capabilities.
Abstract: We develop benchmarks for LLM agents that act in, learn from, and strategize in unknown economic environments, whose specifications the agent must learn over time through deliberate exploration. Our benchmarks consist of decision-making tasks derived from key problems in economics. To forestall saturation, the benchmark tasks are synthetically generated with scalable difficulty levels. Overall, our benchmarks assess the ability of LLM agents to tackle complex economic problems in procurement, scheduling, and pricing—applications that should grow in importance as such agents are further integrated into the economy.
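To illustrate the idea of synthetically generated tasks with scalable difficulty, here is a minimal, hypothetical sketch of a pricing-task generator. The names (`PricingTask`, `generate_pricing_task`, `difficulty`) and the specific difficulty knobs (horizon length, demand noise, number of demand segments) are illustrative assumptions, not the benchmark's actual construction.

```python
# Hypothetical sketch: generate a pricing task whose demand environment is
# hidden from the agent, with a single integer difficulty knob.
import random
from dataclasses import dataclass


@dataclass
class PricingTask:
    """An unknown demand environment the agent must learn by posting prices."""
    horizon: int          # number of pricing rounds the agent gets
    demand_params: dict   # hidden parameters of the demand curve
    noise_std: float      # observation noise on realized demand


def generate_pricing_task(difficulty: int, seed: int = 0) -> PricingTask:
    """Higher difficulty -> shorter horizon, noisier demand, more segments."""
    rng = random.Random(seed)
    n_segments = 1 + difficulty  # piecewise-linear demand segments
    demand_params = {
        f"segment_{i}": {
            "intercept": rng.uniform(50, 150),
            "slope": -rng.uniform(0.5, 3.0),
        }
        for i in range(n_segments)
    }
    return PricingTask(
        horizon=max(10, 100 - 10 * difficulty),
        demand_params=demand_params,
        noise_std=0.5 * difficulty,
    )


def realized_demand(task: PricingTask, price: float, rng: random.Random) -> float:
    """Simulate noisy demand at a posted price; the agent sees only this signal."""
    base = sum(
        max(0.0, seg["intercept"] + seg["slope"] * price)
        for seg in task.demand_params.values()
    )
    return max(0.0, base + rng.gauss(0.0, task.noise_std))


if __name__ == "__main__":
    task = generate_pricing_task(difficulty=3, seed=42)
    rng = random.Random(1)
    # An agent would interact by posting prices over task.horizon rounds,
    # observing noisy demand and earning revenue price * demand each round.
    print(realized_demand(task, price=20.0, rng=rng))
```

Because the environment parameters are hidden and freshly sampled per task, an agent cannot memorize answers; raising the difficulty parameter yields harder instances, which is one way such a benchmark could resist saturation.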
Primary Area: datasets and benchmarks
Submission Number: 18807