2013 (modified: 25 Jan 2025), ACML 2013, Readers: Everyone
Abstract: We extend the Q-learning algorithm from the Markov Decision Process setting to problems where observations are non-Markov and do not reveal the full state of the world, i.e., to POMDPs. We do this in...