Lost in the Middle: An Emergent Property from Information Retrieval Demands in LLMs

ICLR 2026 Conference Submission 21588 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: language modeling, lost-in-the-middle phenomenon, attention dynamics, human memory parallels
TL;DR: The lost-in-the-middle phenomenon in LLMs can arise from training on simple human memory tasks with different retrieval demands.
Abstract: The performance of Large Language Models (LLMs) often degrades when crucial information appears in the middle of a long context, a “lost-in-the-middle” phenomenon that mirrors the primacy and recency effects in human memory. We propose that this behavior is not simply a flaw indicative of information loss but an adaptation to differing information retrieval demands during pre-training: some tasks require uniform recall across the entire input (a long-term memory demand), while others prioritize the most recent information (a short-term memory demand). Consistent with this view, we show that the characteristic U-shaped performance curve emerges when LLMs (GPT-2 and Llama variants) are trained from scratch on two simple human memory paradigms simulating long-term and short-term memory demands. Our analysis reveals that while the recency effect directly aligns with the short-term memory demand in the training data, the primacy effect is induced by the uniform long-term memory demand and is further influenced by the model's autoregressive properties and the formation of attention sinks. Our main findings from these simple human memory paradigms also generalize to a sequence completion task, which more closely resembles the next-token prediction objective used in LLM pre-training. Together, our findings reveal how information retrieval demands, model architecture, and structural attention dynamics during training can jointly produce the positional bias observed in LLMs.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21588