Abstraction for Bayesian Reinforcement Learning in Factored POMDPs

TMLR Paper 4004 Authors

17 Jan 2025 (modified: 26 Jan 2025) · Under review for TMLR · CC BY 4.0
Abstract: Bayesian reinforcement learning provides an elegant approach to the exploration-exploitation trade-off in Partially Observable Markov Decision Processes (POMDPs) when the environment's dynamics and reward function are initially unknown. By maintaining a belief over these unknown components and the state, the agent can effectively learn the environment's dynamics and optimize its policy. However, scaling Bayesian reinforcement learning methods to large problems remains a significant challenge. While prior work has leveraged factored models and online sample-based planning to address this issue, these approaches often retain unnecessarily complex models, keeping factors in the belief space that have minimal impact on the optimal policy. While this complexity might be necessary for accurate model learning, in reinforcement learning the primary objective is not to recover the ground-truth model but to optimize the policy for maximizing the expected sum of rewards. Abstraction offers a way to reduce model complexity by removing factors that are less relevant to achieving high rewards. In this work, we propose and analyze the integration of abstraction with online planning in factored POMDPs. Our empirical results demonstrate two key benefits. First, abstraction reduces model size, enabling faster simulations and thus more planning simulations within a fixed runtime. Second, abstraction improves performance even for a fixed number of simulations, owing to the greater statistical strength of the simpler model. These results underscore the potential of abstraction to improve both the scalability and effectiveness of Bayesian reinforcement learning in factored POMDPs.
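The abstract's central mechanism, projecting a factored model onto its reward-relevant factors so that each planning simulation touches fewer variables, can be illustrated with a minimal Python sketch. All names here (`FACTORS`, `KEEP`, `abstract_state`, `simulate_step`) are illustrative assumptions of ours, not the paper's actual algorithm or API.

```python
import random

# Hypothetical illustration (FACTORS, KEEP, and both functions are our own
# inventions, not from the paper): a factored POMDP state is a dict of
# factors; abstraction projects the state and the learned dynamics onto a
# reward-relevant subset of those factors.

FACTORS = ["position", "battery", "weather", "clock"]   # full factored state
KEEP = {"position", "battery"}                          # reward-relevant factors

def abstract_state(state: dict) -> dict:
    """Project a factored state onto the kept (reward-relevant) factors."""
    return {f: v for f, v in state.items() if f in KEEP}

def simulate_step(state: dict, dynamics: dict) -> dict:
    """One planning simulation step: sample each kept factor's next value
    from its learned conditional distribution (values, weights)."""
    return {f: random.choices(vals, weights)[0]
            for f, (vals, weights) in dynamics.items() if f in state}

# Fewer factors mean fewer conditional distributions to sample and update,
# so each Monte-Carlo planning simulation is cheaper; this is the first
# benefit the abstract describes.
full = {"position": 0, "battery": 5, "weather": "rain", "clock": 12}
dynamics = {"position": ([0, 1], [0.5, 0.5]), "battery": ([4, 5], [0.3, 0.7])}
print(simulate_step(abstract_state(full), dynamics))
```

The second claimed benefit follows from the same projection: with fewer factors, each learned conditional distribution is estimated from more pooled experience, which is one plausible reading of the "greater statistical strength" the abstract mentions.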
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Martha_White1
Submission Number: 4004
