Meta-Reinforcement Learning reconciles surprise, value, and control in the anterior cingulate cortex

Tim Vriens, Eliana Vassena, Giovanni Pezzulo, Gianluca Baldassarre, Massimo Silvetti

Published: 01 Jan 2025, Last Modified: 24 Jul 2025; PLoS Comput. Biol. 2025; CC BY-SA 4.0

Abstract: The role of the dorsal anterior cingulate cortex (dACC) in cognition is a frequently studied yet highly debated topic in neuroscience. Most authors agree that the dACC is involved in either cognitive control (e.g., voluntary inhibition of automatic responses) or monitoring (e.g., comparing expectations with outcomes, detecting errors, tracking surprise). A consensus on which theoretical perspective best explains the dACC contribution to behaviour is still lacking, because two distinct sets of studies report dACC activation: one in tasks that require surprise tracking for performance monitoring, and the other in tasks that require cognitive control without involving surprise monitoring. This creates a theoretical impasse, as no single current account can reconcile these findings. Here we propose a novel hypothesis on dACC function that integrates both the monitoring and the cognitive control perspectives in a unifying meta-Reinforcement Learning framework, in which cognitive control is optimised by meta-learning based on tracking Bayesian surprise. We tested the quantitative predictions of our theory in three different functional neuroimaging experiments that lie at the basis of the current theory crisis. We show that the meta-Reinforcement Learning perspective successfully captures all the neuroimaging results by predicting both cognitive control and monitoring functions, proposing a solution to the theory crisis about dACC function within an integrative framework. In sum, our results suggest that dACC function can be framed as a meta-learning optimisation of cognitive control, providing an integrative perspective on its roles in cognitive control, surprise tracking, and performance monitoring.