Value of Information and Reward Specification in Active Inference and POMDPs

Published: 10 Oct 2024, Last Modified: 20 Nov 2024 · NeuroAI @ NeurIPS 2024 Spotlight · CC BY 4.0
Keywords: Active inference, Bayesian RL, value of information
TL;DR: We show that active inference approximates the Bayes optimal RL policy
Abstract: Active inference is an agent modeling framework with roots in Bayesian predictive coding and the free energy principle. In recent years, active inference has gained popularity for modeling sequential decision making, a problem setting traditionally dominated by reinforcement learning (RL). Instead of optimizing expected reward as in RL, active inference agents optimize expected free energy (EFE), which admits an intuitive decomposition into a pragmatic and an epistemic component. This raises a natural question: what is the optimality gap of an EFE-optimizing agent relative to a reward-driven RL agent, whose behavior is well understood? By casting EFE optimization as a particular class of belief MDP and using analysis tools from RL theory, we show that EFE approximates the Bayes optimal RL policy via information value. We discuss the implications for the objective specification of active inference agents.
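For reference, a common single-timestep form of the EFE decomposition alluded to in the abstract is sketched below in LaTeX; the notation is illustrative (standard in the active inference literature) and may not match the paper's exact formulation. Here q(s | π) is the predictive belief over hidden states under policy π, q(o | π) the predicted observations, and p(o) the prior preference over outcomes that plays the role of reward.

% Illustrative sketch of the standard EFE decomposition (not necessarily the paper's notation).
% Minimizing G(\pi) trades off pragmatic value (preferred outcomes) against epistemic value (information gain).
\begin{aligned}
G(\pi) &= \mathbb{E}_{q(o, s \mid \pi)}\big[\ln q(s \mid \pi) - \ln p(o, s)\big] \\
       &\approx \underbrace{-\,\mathbb{E}_{q(o \mid \pi)}\big[\ln p(o)\big]}_{\text{pragmatic (negated preference) term}}
        \;-\; \underbrace{\mathbb{E}_{q(o \mid \pi)}\Big[D_{\mathrm{KL}}\big(q(s \mid o, \pi)\,\|\,q(s \mid \pi)\big)\Big]}_{\text{epistemic term (expected information gain)}}
\end{aligned}

The epistemic term is what the abstract refers to as information value: it rewards policies whose observations are expected to update the agent's beliefs about hidden states.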
Submission Number: 2