Learning Evolving Latent Strategies for Multi-Agent Language Systems without Model Fine-Tuning

Published: 02 Mar 2026, Last Modified: 15 Apr 2026
Venue: ICLR 2026 Workshop World Models
License: CC BY 4.0
Keywords: LLM Agents, Continual Learning, External Memory, Latent Space, Reflection, Reinforcement Learning, Multi-Agent Systems
TL;DR: We show that large language model agents can achieve continual strategy evolution without fine-tuning by updating an external latent space through reflection and reinforcement feedback.
Abstract: This study proposes a multi-agent language framework that enables continual strategy evolution without fine-tuning the language model’s parameters. The core idea is to liberate the latent vectors of abstract concepts from traditional “static semantic representations,” allowing them to be continuously updated through environmental interaction and reinforcement feedback. We construct a dual-loop architecture: the behavior loop adjusts action preferences based on environmental rewards, while the language loop updates the external latent vectors by reflecting on the semantic embeddings of generated text. Together, these mechanisms allow agents to develop stable and disentangled strategic styles over long-horizon, multi-round interactions. Experiments show that agents’ latent spaces exhibit clear convergence trajectories under reflection-driven updates, along with structured shifts at critical moments. Moreover, the system demonstrates an emergent ability to implicitly infer and continually adapt to emotional agents, even without shared rewards. These results indicate that, without modifying model parameters, an external latent space can provide language agents with a low-cost, scalable, and interpretable form of abstract strategic representation.
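The dual-loop idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no update rules, so the gradient-bandit preference update, the moving-average latent update, the toy environment, and every name below (`behavior_update`, `language_update`, `toy_embed`, `run_episode`) are assumptions chosen for illustration. In a real system the reflection text would come from a frozen language model and the embedding from a sentence-embedding model; both are replaced here by hypothetical stand-ins.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def behavior_update(prefs, action, reward, baseline, lr=0.1):
    """Behavior loop (assumed rule): gradient-bandit preference update."""
    probs = softmax(prefs)
    advantage = reward - baseline
    return [
        p + lr * advantage * ((1.0 if i == action else 0.0) - probs[i])
        for i, p in enumerate(prefs)
    ]


def language_update(latent, text_embedding, rate=0.2):
    """Language loop (assumed rule): move the external latent vector
    toward the embedding of the reflection text (moving average)."""
    return [(1.0 - rate) * z + rate * e for z, e in zip(latent, text_embedding)]


def toy_embed(text, dim=4):
    """Hypothetical stand-in for a sentence-embedding model:
    hashed character counts, L2-normalised."""
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def run_episode(rounds=50, seed=0):
    """Run both loops against a toy two-action environment.
    No language-model parameters exist here, so none are updated."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]        # hypothetical actions: 0 = cooperate, 1 = defect
    latent = [0.0] * 4        # external latent vector, stored outside the model
    baseline = 0.0            # running mean reward
    for t in range(1, rounds + 1):
        action = rng.choices(range(len(prefs)), weights=softmax(prefs))[0]
        reward = 1.0 if action == 0 else 0.2          # toy reward signal
        prefs = behavior_update(prefs, action, reward, baseline)
        baseline += (reward - baseline) / t
        # A frozen LLM would generate this reflection; fixed strings stand in.
        reflection = "cooperation paid off" if reward > 0.5 else "defection paid little"
        latent = language_update(latent, toy_embed(reflection))
    return prefs, latent
```

Calling `run_episode()` returns the final action preferences and the final latent vector. The only state that changes across rounds lives in those two plain lists, which is the sense in which the strategy evolves without touching model weights.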
Submission Number: 15