Keywords: POMDPs, partial observability, planning, belief state modeling, embeddings
TL;DR: We present a planning algorithm for POMDPs that tracks the distribution over hidden states as a latent vector.
Abstract: We present Belief Embedding Tree Search (BETS), a novel planning algorithm for Partially Observable Markov Decision Processes (POMDPs).
Effective planning hinges on accurately approximating the agent's belief state, yet existing methods become prohibitively expensive as the belief state grows.
BETS addresses this by compressing beliefs into fixed-length *embeddings* that are updated with each new observation.
Conditioning a generative model on these embeddings enables Monte-Carlo planning over the approximate belief states.
An initial evaluation on the standard benchmark *PocMan*---restricted to a reduced grid size---shows promising results compared to similarly budgeted particle-filtering baselines.
This highlights the potential for BETS to scale online planning to larger POMDPs.
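The core loop described in the abstract can be illustrated with a minimal sketch: a recurrent-style update compresses (belief embedding, action, observation) into a new fixed-length embedding, a generative model conditioned on the embedding samples observations, and Monte-Carlo rollouts are run entirely in embedding space. All dimensions, parameter shapes, the random rollout policy, and the stand-in reward below are illustrative assumptions, not details of BETS itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (assumptions, not from the paper).
EMB, ACT, OBS = 8, 2, 3

# Randomly initialized weights stand in for a trained belief-update
# network and a trained generative model conditioned on the embedding.
W_update = rng.normal(scale=0.1, size=(EMB, EMB + ACT + OBS))
W_gen = rng.normal(scale=0.1, size=(OBS, EMB + ACT))

def update_belief(b, a, o):
    """Compress (belief embedding, action, observation) into a new
    fixed-length belief embedding via a simple recurrent update."""
    return np.tanh(W_update @ np.concatenate([b, a, o]))

def sample_observation(b, a):
    """Generative model conditioned on the belief embedding: sample a
    one-hot observation from a softmax over logits."""
    logits = W_gen @ np.concatenate([b, a])
    p = np.exp(logits - logits.max())
    p /= p.sum()
    o = np.zeros(OBS)
    o[rng.choice(OBS, p=p)] = 1.0
    return o

def rollout_value(b, depth=3):
    """Monte-Carlo rollout in embedding space under a random policy,
    with a placeholder reward (a real planner would use a learned or
    simulated reward model)."""
    total = 0.0
    for _ in range(depth):
        a = np.zeros(ACT)
        a[rng.integers(ACT)] = 1.0   # random action (illustrative policy)
        o = sample_observation(b, a)
        total += float(b.sum())      # stand-in reward signal
        b = update_belief(b, a, o)
    return total

b0 = np.zeros(EMB)  # initial belief embedding
print(np.mean([rollout_value(b0) for _ in range(10)]))
```

In the full algorithm these rollouts would be organized into a search tree over action nodes, with each edge advancing the belief embedding; the sketch only shows the per-step mechanics.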
Submission Number: 9