Active Inference through Incentive Design in Partially Observable Markov Decision Processes
Keywords: Plan and goal recognition, Planning under uncertainty
Abstract: Active inference refers to a class of methods that influence or control observed information to minimize uncertainty about latent or unknown variables. In this paper, we study a class of active inference problems in which an agent (the leader), with only partial observations, seeks to infer the unknown type of another agent (the follower), whose interactions with a dynamic environment are modeled as a Markov decision process (MDP). Different follower types are characterized by distinct dynamics, reward functions, or both, and each follower acts optimally to maximize its own reward.
To improve inference accuracy and efficiency under imperfect observations, we introduce the paradigm of Active Inference through Incentive Design, wherein the leader strategically offers side payments (incentives) to elicit diverging observable behaviors from different follower types. This formulation leads to a leader–follower game in which the leader balances the trade-off between incentive cost and information gain, quantified by the entropy of the posterior distribution over follower types. We show that the resulting bi-level optimization problem can be reduced to a single-level one by leveraging the softmax temporal consistency between followers’ policies and value functions. This reduction enables an efficient first-order, gradient-based algorithm, where gradients are computed using observable operators from hidden Markov models. Experimental results in stochastic gridworld environments demonstrate that the proposed method significantly improves both the accuracy and efficiency of intent inference compared to systems without incentive mechanisms.
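To make the trade-off and the reduction concrete, below is a minimal sketch of the leader's objective together with the softmax temporal consistency (soft Bellman) conditions for each follower type. The notation is assumed rather than taken from the paper: x denotes the side payments, theta the follower type, o_{1:T} the leader's partial observations, lambda a cost weight, and tau the softmax temperature.

% Hypothetical notation: x = side payments, \theta = follower type,
% o_{1:T} = leader's observations, \lambda = incentive-cost weight, \tau = temperature.
\begin{align*}
% Leader: balance expected posterior entropy over types against incentive cost.
\min_{x}\;\; & \mathbb{E}_{o_{1:T}}\!\big[\, H\big(P(\theta \mid o_{1:T}, x)\big) \,\big] \;+\; \lambda\, c(x) \\
% Each follower type responds with an entropy-regularized optimal policy;
% softmax temporal consistency ties its policy to its value function:
V_\theta(s) \;&=\; \tau \log \sum_{a} \exp\!\Big( \tfrac{1}{\tau}\, Q_\theta(s,a;x) \Big), \qquad
\pi_\theta(a \mid s) \;=\; \exp\!\Big( \tfrac{1}{\tau}\big( Q_\theta(s,a;x) - V_\theta(s) \big) \Big), \\
Q_\theta(s,a;x) \;&=\; r_\theta(s,a) + x(s,a) + \gamma \sum_{s'} P_\theta(s' \mid s,a)\, V_\theta(s').
\end{align*}

Under these assumed definitions, substituting the consistency equations for the followers' best responses is what collapses the bi-level problem into a single smooth objective in x, to which first-order gradient methods can be applied.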
Area: Search, Optimization, Planning, and Scheduling (SOPS)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 1364