Information-based Value Iteration Networks for Decision Making Under Uncertainty

ICLR 2026 Conference Submission13955 Authors

18 Sept 2025 (modified: 08 Oct 2025) — ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Reinforcement Learning, value iteration networks, planning under uncertainty
TL;DR: We propose a novel deep architecture for decision making under uncertainty, based on planning for both reward maximization and information gathering.
Abstract: Deep neural networks that incorporate classic reinforcement learning methods, such as value iteration, into their structure significantly outperform randomly structured networks in learning and generalization. These networks, however, are mostly limited to environments with little or no uncertainty. In this paper, we propose a new planning module architecture, the VI$^2$N (Value Iteration with Value of Information Network), that learns to act in novel environments with a high degree of perceptual ambiguity. This architecture emphasizes reducing uncertainty before exploiting reward. VI$^2$N can also exploit factorization in environments with mixed observability, decreasing the computational complexity of computing the policy and facilitating learning. Tested on a diverse set of domains, each containing various types of environments, our network outperforms other deep architectures. Moreover, VI$^2$N generates interpretable cognitive maps that highlight both rewarding and informative locations, i.e., the key states the agent must visit to achieve its goal.
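For readers unfamiliar with the planning primitive that VIN-style architectures embed, the following is a minimal sketch of classic tabular value iteration on a toy MDP. The MDP (its states, actions, transition matrix, and rewards) is hypothetical and not from the paper; the paper's VI$^2$N additionally plans over beliefs and values of information, which this sketch does not show.

```python
import numpy as np

# Toy MDP: 4 states, 2 actions, random dynamics (hypothetical example).
n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.standard_normal((n_states, n_actions))                    # R[s, a]

# Classic value iteration: repeated Bellman backups until convergence.
V = np.zeros(n_states)
for _ in range(1000):
    # Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') * V(s')
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)  # greedy backup over actions
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged values
```

A value iteration network replaces this explicit loop with a fixed number of differentiable (convolutional) backup steps, so that the reward and transition maps can be learned end-to-end.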
Primary Area: reinforcement learning
Submission Number: 13955