Memory Transplants for LLM Agents: Disentangling Architecture and Content Transfer under a Code-to-Math Shift
Keywords: memory-augmented agents, cross-domain transfer, memory architecture, experience replay, LLM agents, factorial experimental design
TL;DR: A memory transplant protocol disentangles architecture from content transfer across a code-to-math shift, finding that architecture transfer is system-dependent, static content transfer is limited, and weaker solvers benefit most (+15pp vs +7pp).
Abstract: Memory-augmented LLM agents accumulate experience to improve over time, but when transferring to a new domain, observed gains may stem from either the memory mechanism (how experiences are stored and retrieved) or the stored content (the experiences themselves). Prior cross-domain evaluations conflate these two factors, leaving it unclear which component generalizes. We introduce a memory transplant protocol that disentangles architecture from content by independently varying each across a code-to-math domain shift (LiveCodeBench to MATH). Using a 2x2 factorial design with seven transplant conditions, five memory systems spanning simple RAG to evolved multi-tier architectures, and six pre-registered validation gates, we evaluate two solver scales (Qwen 2.5 7B and Llama 3.2 3B) under both static (retrieval-only) and dynamic (full learning) regimes. We find that architecture transfer is system-dependent with no universal direction, and that content transfer in static mode provides limited benefit beyond a no-memory baseline. The most striking result is that solver capability moderates transfer magnitude: the weaker model shows gains of up to +15 percentage points versus +7 for the stronger, suggesting that memory transplantation is most valuable where intrinsic model capability is limited.
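The abstract's 2x2 factorial core (architecture source crossed with content source) plus extra baseline conditions can be sketched as follows. This is a hypothetical illustration of how seven conditions might arise from a 2x2 grid; the condition names and the specific baselines are assumptions, not the paper's actual labels.

```python
from itertools import product

# 2x2 core: which domain supplied the memory architecture, crossed with
# which domain supplied the stored content (experiences).
ARCH = ["source_domain", "target_domain"]
CONTENT = ["source_domain", "target_domain"]

core = [f"arch={a}/content={c}" for a, c in product(ARCH, CONTENT)]

# Illustrative extra baselines bringing the total to seven; the paper's
# actual three non-factorial conditions may differ.
baselines = ["no_memory", "empty_memory", "shuffled_content"]

conditions = core + baselines
assert len(conditions) == 7
for cond in conditions:
    print(cond)
```

Enumerating the grid this way makes explicit that four of the seven conditions vary architecture and content independently, which is what lets the analysis attribute transfer gains to one factor or the other.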
Submission Number: 106