Keywords: Reinforcement Learning, Representation Learning
TL;DR: We exploit the structure of the noise to obtain a provable and practical exploration algorithm for representation learning in reinforcement learning, which outperforms existing state-of-the-art algorithms on several benchmarks.
Abstract: Representation learning lies at the heart of the empirical success of deep learning in dealing with the curse of dimensionality. However, the power of representation learning has not yet been fully exploited in reinforcement learning (RL), due to (i) the trade-off between expressiveness and tractability and (ii) the coupling between exploration and representation learning. In this paper, we first show that, under certain noise assumptions in the stochastic control model, the linear spectral feature of the corresponding Markov transition operator can be obtained in closed form for free. Based on this observation, we propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and carries out optimistic exploration for representation learning by exploiting the structure of the noise. We provide a rigorous theoretical analysis of SPEDE and demonstrate superior practical performance over existing state-of-the-art empirical algorithms on several benchmarks.
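To make the closed-form claim concrete, here is a minimal sketch under one specific noise assumption that the abstract does not itself state: additive isotropic Gaussian noise, s' = f(s, a) + ε with ε ~ N(0, σ²I). Under that assumption the transition density is a Gaussian kernel between f(s, a) and s', so random Fourier features give an approximate linear factorization p(s' | s, a) ≈ ⟨φ(f(s, a)), μ(s')⟩. The names `make_random_features`, `toy_dynamics`, and `gaussian_density` are hypothetical and chosen for illustration; this is not the paper's implementation.

```python
import numpy as np


def make_random_features(dim, num_features, sigma, seed=0):
    """Random Fourier features for the kernel exp(-||x - y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    # Frequencies drawn from the kernel's spectral density, N(0, I / sigma^2).
    W = rng.normal(0.0, 1.0 / sigma, size=(num_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def phi(x):
        return np.sqrt(2.0 / num_features) * np.cos(W @ x + b)

    return phi


def gaussian_density(s_next, mean, sigma):
    """Exact density of N(mean, sigma^2 I) evaluated at s_next."""
    d = mean.shape[0]
    norm = (2.0 * np.pi * sigma**2) ** (-d / 2.0)
    return norm * np.exp(-np.sum((s_next - mean) ** 2) / (2.0 * sigma**2))


def toy_dynamics(s, a):
    """Hypothetical mean dynamics f(s, a); any deterministic map would do."""
    return 0.9 * s + 0.1 * a


sigma = 1.0
dim = 2
phi = make_random_features(dim=dim, num_features=20000, sigma=sigma)

s = np.array([0.5, -0.2])
a = np.array([1.0, 0.3])
s_next = np.array([0.7, 0.1])

mean = toy_dynamics(s, a)
norm = (2.0 * np.pi * sigma**2) ** (-dim / 2.0)

# Linear factorization: feature of (s, a) times feature of s'.
feature_sa = phi(mean)            # plays the role of phi(s, a)
feature_next = norm * phi(s_next)  # plays the role of mu(s')
approx = float(feature_sa @ feature_next)
exact = float(gaussian_density(s_next, mean, sigma))

print(f"exact density:  {exact:.5f}")
print(f"linear approx:  {approx:.5f}")
```

In this sketch the feature map depends on (s, a) only through f(s, a), so under the Gaussian assumption fitting the dynamics model is enough to determine the representation; the random features themselves are fixed in advance.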
Supplementary Material: zip