Fast exploration and learning of latent graphs with aliased observations

22 Sept 2022 (modified: 13 Feb 2023), ICLR 2023 Conference Withdrawn Submission
Keywords: graph learning, fast exploration, aliased environments, POMDPs
Abstract: We consider the problem of quickly recovering the structure of a latent graph by navigating in it, when the agent can only perform stochastic actions and, crucially, different nodes may emit the same observation. This corresponds to learning the transition function of a partially observable Markov decision process (POMDP) in which observations are deterministic. The setting is highly relevant for partially observed reinforcement learning, where the agent must swiftly learn to navigate new environments from sensory observations. The challenge involves solving two coupled problems: exploring the graph as fast as possible, and learning its structure from the aliased observations collected along the way, where a better-learned model in turn speeds up exploration. Our approach leverages a recently proposed model, the Clone Structured Cognitive Graph (CSCG), which can handle aliasing and guide exploration. We provide empirical evidence that our model-based algorithm recovers graphs across a wide range of challenging topologies and scales linearly with graph size, even for severely aliased and loopy graph structures where model-free methods require an exponential number of steps.
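To make the setting concrete, the following is a minimal illustrative sketch (our own toy construction, not the authors' code or the actual CSCG implementation): a small latent graph whose nodes emit aliased observations is explored with stochastic actions by a plain random walk, and an action-conditioned cloned HMM, in the spirit of a CSCG, is fit with EM so that clones of the same observation symbol can represent distinct latent nodes. All names, sizes, and hyperparameters below are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch only: toy aliased latent graph + EM for an
# action-conditioned cloned HMM (CSCG-style). Not the paper's code.
rng = np.random.default_rng(0)

n_nodes, n_actions, n_symbols = 6, 2, 4
node_obs = np.array([0, 1, 2, 3, 1, 2])          # nodes 1/4 and 2/5 are aliased
T_true = rng.dirichlet(0.3 * np.ones(n_nodes), size=(n_actions, n_nodes))

def rollout(steps):
    """Random exploration: returns (actions, observations)."""
    node = rng.integers(n_nodes)
    acts, obs = [], [node_obs[node]]
    for _ in range(steps):
        a = rng.integers(n_actions)
        node = rng.choice(n_nodes, p=T_true[a, node])
        acts.append(a)
        obs.append(node_obs[node])
    return np.array(acts), np.array(obs)

acts, obs = rollout(20000)

# Cloned HMM: each symbol gets n_clones hidden states; emission is the
# deterministic map state -> state // n_clones, so aliased nodes can be
# split into different clones of the same symbol.
n_clones = 3
S = n_symbols * n_clones
emit = np.arange(S) // n_clones                   # symbol emitted by each state
A = rng.dirichlet(np.ones(S), size=(n_actions, S))  # learned transition tensor
pi = np.full(S, 1.0 / S)                          # fixed uniform prior (sketch)

def em_step(A, pi, acts, obs):
    T_len = len(obs)
    mask = (emit[None, :] == obs[:, None]).astype(float)   # (T_len, S)
    # Forward pass with per-step normalization to avoid underflow.
    alpha = np.zeros((T_len, S))
    alpha[0] = pi * mask[0]
    alpha[0] /= alpha[0].sum()
    for t in range(T_len - 1):
        a = alpha[t] @ A[acts[t]] * mask[t + 1]
        alpha[t + 1] = a / a.sum()
    # Backward pass (also rescaled for stability).
    beta = np.zeros((T_len, S))
    beta[-1] = 1.0
    for t in range(T_len - 2, -1, -1):
        b = A[acts[t]] @ (mask[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()
    # E-step: expected transition counts per action; M-step: renormalize rows.
    counts = np.zeros_like(A)
    for t in range(T_len - 1):
        xi = alpha[t][:, None] * A[acts[t]] * (mask[t + 1] * beta[t + 1])[None, :]
        counts[acts[t]] += xi / xi.sum()
    A_new = counts + 1e-6                         # pseudocount avoids empty rows
    return A_new / A_new.sum(axis=2, keepdims=True)

for _ in range(50):
    A = em_step(A, pi, acts, obs)
```

After EM converges, the learned transition tensor over clone states approximates the latent graph: aliased nodes are pulled apart into different clones of the same symbol because their transition contexts differ. The paper's contribution, as described in the abstract, goes beyond this passive setup by also using the partially learned model to choose exploratory actions, rather than relying on a random walk.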
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)