Hybrid Neural-Cognitive Models Reveal How Memory Shapes Human Reward Learning

Published: 05 Feb 2026, Last Modified: 02 Feb 2026 · Nature Human Behavior · CC BY-ND 4.0
Abstract: A longstanding challenge for psychology and neuroscience is to understand the transformations by which past experiences shape future behavior. Reward-guided learning is typically modeled using simple reinforcement learning (RL) algorithms [1–3]. In RL, a handful of incrementally updated internal variables both summarize past rewards and drive future choice. Here, we question the assumptions underlying many such RL models. We adopt a hybrid modeling approach that integrates artificial neural networks into interpretable cognitive architectures, estimating a maximally general form for each algorithmic component and systematically evaluating its necessity and sufficiency [4]. Applying this method to a large dataset of human reward-learning behavior, we show that successful models require independent and flexible memory variables that can track rich representations of the past. These results, obtained with a modeling approach that combines predictive accuracy and interpretability, call into question an entire class of popular RL models based on incremental updating of scalar reward predictions.
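For concreteness, the sketch below simulates the class of model the abstract describes as standard: a scalar reward prediction per option, updated incrementally by a delta rule and mapped to choices through a softmax. The `hybrid_step` function then illustrates, in toy form, the hybrid idea of letting a learned recurrent memory replace the scalar summaries; it is not the authors' architecture, and all names, parameters, and the tanh-RNN form here are illustrative assumptions (in practice the network weights would be fit to human choice data, e.g. by maximum likelihood).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(q, beta):
    """Map values to choice probabilities with inverse temperature beta."""
    z = beta * (q - q.max())
    p = np.exp(z)
    return p / p.sum()

def simulate_delta_rule(n_trials=200, reward_probs=(0.8, 0.2),
                        alpha=0.3, beta=3.0):
    """Standard incremental RL: one scalar reward prediction per arm,
    updated by the delta rule after each outcome."""
    n_arms = len(reward_probs)
    q = np.zeros(n_arms)                  # scalar reward predictions
    choices, rewards = [], []
    for _ in range(n_trials):
        c = rng.choice(n_arms, p=softmax(q, beta))
        r = float(rng.random() < reward_probs[c])
        q[c] += alpha * (r - q[c])        # incremental update of a scalar summary
        choices.append(c)
        rewards.append(r)
    return np.array(choices), np.array(rewards)

# Hybrid idea (illustrative only): replace the hand-written scalar update
# with a learned recurrent memory whose hidden state can track richer
# features of the past than a single reward prediction per option.
def hybrid_step(h, choice_onehot, reward, W_in, W_rec, b, W_out):
    x = np.concatenate([choice_onehot, [reward]])
    h = np.tanh(W_in @ x + W_rec @ h + b)  # flexible, learned memory update
    return h, W_out @ h                    # memory -> choice preferences

if __name__ == "__main__":
    choices, rewards = simulate_delta_rule()
    print(f"delta-rule agent mean reward: {rewards.mean():.2f}")

    # Forward pass of the hybrid step with random (unfitted) weights.
    n_hidden, n_arms = 8, 2
    h = np.zeros(n_hidden)
    W_in = rng.normal(scale=0.5, size=(n_hidden, n_arms + 1))
    W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
    b = np.zeros(n_hidden)
    W_out = rng.normal(scale=0.5, size=(n_arms, n_hidden))
    h, logits = hybrid_step(h, np.eye(n_arms)[0], 1.0, W_in, W_rec, b, W_out)
    print("hybrid-model choice probabilities:", softmax(logits, beta=1.0))
```

The contrast the abstract draws maps onto the two functions: in `simulate_delta_rule` the entire memory of past rewards is the vector `q` of scalar predictions, whereas in `hybrid_step` the memory is a hidden state whose update rule is itself estimated from behavior.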