Semiparametric Reinforcement Learning

Mika Sarkin Jain, Jack Lindsey

Feb 12, 2018 (modified: Feb 12, 2018) ICLR 2018 Workshop Submission
  • Abstract: We introduce a semiparametric approach to deep reinforcement learning inspired by complementary learning systems theory in cognitive neuroscience. Our approach allows a neural network to integrate nonparametric, episodic memory-based computations with parametric statistical learning in an end-to-end fashion. We give a deep Q network access to the intermediate and final results of a differentiable approximation to k-nearest-neighbors (kNN) performed on a dictionary of historical state-action embeddings. Our method displays the early-learning advantage associated with episodic memory-based algorithms while mitigating the asymptotic performance disadvantage suffered by such approaches. In several cases we find that our model learns even more quickly from few examples than pure kNN-based approaches. Analysis shows that our semiparametric algorithm relies heavily on the kNN output early in training and less so as training progresses, which is consistent with complementary learning systems theory.
  • TL;DR: Combining parametric and nonparametric methods in RL problems yields fast learning while maintaining good final performance.
  • Keywords: deep learning, nonparametric, episodic learning, nearest neighbors, complementary learning systems, reinforcement learning
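
The abstract refers to a differentiable approximation to k-nearest-neighbors over a dictionary of stored embeddings but does not give its form. One common way to make such a lookup differentiable is to replace the hard top-k selection with a softmax over negative distances. The sketch below illustrates that idea only and is not taken from the paper: the function name `soft_knn_value`, the `temperature` parameter, the squared Euclidean distance, and the toy data are all assumptions. It returns both the weights (an intermediate result) and the weighted value estimate (a final result), which loosely mirrors the kind of signals a Q network could consume.

```python
import numpy as np

def soft_knn_value(query, keys, values, temperature=1.0):
    """Softmax-weighted average of stored values: a smooth stand-in for kNN.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Squared Euclidean distance from the query embedding to every stored key.
    sq_dists = np.sum((keys - query) ** 2, axis=1)
    # Closer keys get larger logits; subtract the max for numerical stability.
    logits = -sq_dists / temperature
    logits = logits - logits.max()
    weights = np.exp(logits)
    weights = weights / weights.sum()
    # Final estimate plus the intermediate neighbor weights.
    return weights @ values, weights

# Toy memory: three stored state-action embeddings with associated returns.
keys = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
values = np.array([1.0, 3.0, 100.0])

# A low temperature behaves almost like a hard nearest-neighbor lookup.
estimate, weights = soft_knn_value(np.array([0.0, 0.0]), keys, values, temperature=0.1)

# A very high temperature approaches a plain average over the whole memory.
flat_estimate, flat_weights = soft_knn_value(np.array([0.0, 0.0]), keys, values, temperature=1e9)
```

Because every step is a smooth function of the query and the keys, gradients can flow through the lookup into whatever network produces the embeddings, which is what an end-to-end combination with a parametric learner would require.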