Benchmarking In-context Experiential Learning Through Repeated Product Recommendations

ICLR 2026 Conference Submission 22692 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: In-context learning, Experiential Learning, Multi-episode learning, Multi-turn learning, Uncertainty resolution, Interactive learning, Sparse feedback, Recommendation, Dataset
Abstract: To reliably navigate ever-shifting real-world environments, agents must grapple with incomplete knowledge and adapt their behavior through experience. However, current evaluations largely focus on tasks that leave no ambiguity and do not measure agents' ability to adaptively learn and improve as they accrue experience. We exemplify the need for in-context experiential learning in a product recommendation setting, where agents must navigate shifting customer preferences and product landscapes through natural language dialogue. We curate BIEL, a benchmark that combines i) rich real-world products from Amazon, ii) a diverse collection of user personas representing heterogeneous yet latent preferences, and iii) an LLM user simulator driven by these personas to create realistic, interactive trajectories. We observe that current frontier models struggle to meaningfully improve across episodes, underscoring the need for agentic systems with strong in-context experiential learning capabilities.
Primary Area: datasets and benchmarks
Submission Number: 22692