Experience-Guided Behavior Adaptation for Large Language Models

Iknoor Singh; Harjot Singh; Abhishek Tripathi; Niresh Agarwal; Murat Sensoy

Published: 23 May 2026; Last Modified: 03 Jun 2026; Venue: CATS@ICML26 Poster; License: CC BY 4.0

Keywords: Large Language Models, In-context learning, Continual learning in LLMs, Post-training adaptation, Retrieval-augmented generation, Parameter-efficient fine-tuning

Abstract: Large language models (LLMs) cannot accumulate experience across interactions without parameter updates. Retrieval-augmented generation and memory-based approaches attempt to leverage past interactions but typically rely on semantic similarity alone and ignore whether experiences actually improve performance. We introduce an uncertainty-aware guidance framework that distills compact guidance from past failures and selects it via a contextual bandit formulation. Each guidance item maintains a Beta posterior over effectiveness, and Thompson sampling balances exploration and exploitation, allowing the model to downweight unhelpful guidance over time. Across benchmarks, our method corrects up to 69.5% of prior errors and improves Haiku 4.5 accuracy by up to 26%. Notably, guidance distilled from a weaker open-weight model (Qwen3 4B) transfers effectively to a stronger proprietary model (Haiku 4.5), demonstrating experience exchange across models in the context space.
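The selection mechanism named in the abstract (a Beta posterior per guidance item, sampled via Thompson sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `GuidanceItem`, `select_guidance`, and `simulate` are hypothetical, the reward is assumed to be binary (the guidance either helped or did not), the prior is assumed to be uniform Beta(1, 1), and the contextual part of the bandit (retrieving candidates relevant to the current query) is omitted, so every item is treated as a candidate on every round.

```python
import random
from dataclasses import dataclass


@dataclass
class GuidanceItem:
    """Hypothetical guidance record with a Beta posterior over effectiveness."""
    text: str
    alpha: float = 1.0  # 1 + number of times the guidance helped (assumed Beta(1, 1) prior)
    beta: float = 1.0   # 1 + number of times the guidance did not help

    def sample(self, rng: random.Random) -> float:
        # Draw one plausible effectiveness value from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, helped: bool) -> None:
        # Conjugate Beta-Bernoulli update for an assumed binary outcome.
        if helped:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self) -> float:
        # Posterior mean effectiveness.
        return self.alpha / (self.alpha + self.beta)


def select_guidance(candidates, rng):
    """Thompson sampling: sample each posterior once, pick the largest draw."""
    return max(candidates, key=lambda item: item.sample(rng))


def simulate(true_rates, rounds, seed=0):
    """Toy loop with made-up success rates, only to show the update dynamics."""
    rng = random.Random(seed)
    items = [GuidanceItem(text=name) for name in true_rates]
    pulls = {name: 0 for name in true_rates}
    for _ in range(rounds):
        chosen = select_guidance(items, rng)
        pulls[chosen.text] += 1
        helped = rng.random() < true_rates[chosen.text]
        chosen.update(helped)
    return items, pulls


if __name__ == "__main__":
    items, pulls = simulate({"helpful": 0.8, "unhelpful": 0.2}, rounds=2000)
    for item in items:
        print(item.text, pulls[item.text], round(item.mean(), 3))
```

In this toy setting, an item that rarely helps accumulates failures, its posterior shifts toward low effectiveness, and it is sampled less often, which is one way to read the abstract's claim that unhelpful guidance is downweighted over time. The success rates passed to `simulate` are invented for the illustration and are not results from the paper.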

Submission Number: 53
