Beyond Experience: Fictive Learning as an Inherent Advantage of World Models

Published: 19 Sept 2025, Last Modified: 19 Sept 2025
NeurIPS 2025 Workshop EWM · CC BY 4.0
Keywords: Fictive learning; Reinforcement learning; Two-step task
TL;DR: We integrate model-based reinforcement learning with fictive learning; through simulations and animal experiments, we highlight fictive learning as a biologically plausible and natural way by which world models enhance learning and decision making.
Abstract: Reinforcement learning (RL) provides a normative computational framework for reward-based decision making, in which world models play a central role in enabling efficient learning and flexible planning. Classical RL algorithms learn from experienced outcomes, whereas humans and animals can generalize learning to unexperienced events using internal world models, a process known as fictive learning. We propose a simple, brain-inspired fictive learning rule to augment model-based RL and use the rodent two-step task to examine whether fictive learning better explains the observed behavior and improves performance through greater sample efficiency. The rule uses the same reward prediction error (RPE) to update both experienced and unexperienced states and actions, scaling the fictive update by the event correlation inferred from the internal model. Through simulations, we show that this model achieves the highest accuracy among the models tested and better reproduces key behavioral traits observed in the two-step task. Model fitting confirms its superior fit over existing alternatives. Furthermore, the model replicates the striatal dopaminergic dynamics observed in the same task, suggesting that the brain may implement fictive learning for reward-based learning. The fictive learning studied here is conceptually analogous to machine learning approaches such as off-policy learning and counterfactual reasoning. These results suggest that fictive learning could be an inherent advantage of world models, highlighting its role as both a natural component of model-based decision making and an indispensable principle for more efficient learning algorithms that utilize world models.
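The abstract states the rule only in words. The following is a minimal, hypothetical sketch of such a fictive update in a tabular two-action setting; the learning rate, the correlation matrix `corr` (standing in for the event correlation inferred from the internal world model), and the function name are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

n_actions = 2
q = np.zeros(n_actions)  # action values
alpha = 0.1              # learning rate (assumed value)

# corr[a, b]: correlation between outcomes of actions a and b, as the
# internal world model would infer it. Values here are assumptions;
# in the paper this quantity comes from the learned world model.
corr = np.array([[1.0, -0.8],
                 [-0.8, 1.0]])

def fictive_update(action, reward):
    """Apply one experienced + fictive update from a single outcome."""
    rpe = reward - q[action]  # reward prediction error
    for a in range(n_actions):
        # The same RPE updates every action, scaled by its correlation
        # with the chosen action; corr[action, action] = 1 recovers the
        # ordinary experienced update.
        q[a] += alpha * corr[action, a] * rpe

# Example: choosing action 0 and receiving a reward increases q[0] and,
# through the negative correlation, decreases q[1] without experiencing it.
fictive_update(action=0, reward=1.0)
```

Under these assumptions, a single outcome propagates to unexperienced actions in proportion to their model-inferred correlation with the chosen action, which is the source of the sample-efficiency gain the abstract describes.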
Submission Number: 62