ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL

Published: 16 Sept 2025, Last Modified: 16 Sept 2025 · CoRL 2025 Spotlight · CC BY 4.0
Keywords: RL, Memory, POMDP, Robotics
TL;DR: ELMUR is a transformer with layer-local external memory and LRU-based updates for long-horizon reasoning.
Abstract: Real-world robotic agents must act under partial observability and long horizons, where key cues may appear long before they affect decision making. Standard recurrent or transformer models struggle: context windows truncate history, and naive memory extensions fail under scale and sparsity. We propose ELMUR (External Layer Memory with Update/Rewrite), a transformer architecture with structured external memory. Each layer maintains memory embeddings, interacts with them via bidirectional cross-attention, and updates them through a Least Recently Used (LRU) memory module using replacement or convex blending. This design extends effective horizons up to 100,000 times beyond the context length. On synthetic benchmarks and robotic manipulation tasks from MIKASA-Robo with sparse rewards, ELMUR consistently outperforms strong baselines, achieving robust long-term recall and showing promising results where prior models struggle. Our results show that structured external memory is a simple and effective recipe for scalable decision making under partial observability.
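To make the memory mechanism concrete, below is a minimal sketch of a layer-local external memory block with bidirectional cross-attention reads/writes and an LRU-style slot rewrite with convex blending. All names, shapes, and hyperparameters (`LayerMemory`, `n_slots`, the gating layer) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LayerMemory(nn.Module):
    """Sketch of one layer's external memory (assumed interface, not the paper's code).

    Tokens read from memory via cross-attention; memory slots are then
    rewritten via convex blending, with the least-recently-updated slot
    fully replaced (LRU-style).
    """

    def __init__(self, d_model: int, n_slots: int, n_heads: int = 4):
        super().__init__()
        self.read_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.write_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model, 1)  # per-slot blending coefficient
        self.register_buffer("memory", torch.zeros(1, n_slots, d_model))
        self.register_buffer("last_used", torch.zeros(n_slots))
        self.step = 0

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, d_model); memory broadcast over the batch
        mem = self.memory.expand(tokens.size(0), -1, -1)

        # Read direction: tokens attend to memory slots.
        read, _ = self.read_attn(query=tokens, key=mem, value=mem)
        tokens = tokens + read

        # Write direction: memory slots attend to tokens to propose updates.
        update, _ = self.write_attn(query=mem, key=tokens, value=tokens)

        # Convex blending: m <- (1 - alpha) * m + alpha * update, alpha in (0, 1).
        alpha = torch.sigmoid(self.gate(update))
        new_mem = (1 - alpha) * mem + alpha * update

        # LRU replacement: overwrite the least-recently-updated slot outright.
        self.step += 1
        lru = int(self.last_used.argmin())
        new_mem = new_mem.clone()
        new_mem[:, lru] = update.mean(dim=0)[lru]
        self.last_used[lru] = self.step

        # Persist memory across segments (detached, averaged over the batch).
        self.memory = new_mem.mean(dim=0, keepdim=True).detach()
        return tokens


# Usage: process one segment of a long trajectory; memory persists across calls.
layer = LayerMemory(d_model=64, n_slots=8)
segment = torch.randn(2, 16, 64)  # (batch, seq_len, d_model)
out = layer(segment)
```

Because each layer keeps its own slots and only reads/writes through cross-attention, the per-segment compute stays bounded by the context length while information persists across segments, which is what allows the effective horizon to extend far beyond the window.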
Submission Number: 16