- Keywords: deep learning, nonparametric, episodic learning, nearest neighbors, complementary learning systems, reinforcement learning
- TL;DR: Combining parametric and nonparametric methods in RL problems yields fast learning while maintaining good final performance.
- Abstract: We introduce a semiparametric approach to deep reinforcement learning inspired by complementary learning systems theory in cognitive neuroscience. Our approach allows a neural network to integrate nonparametric, episodic memory-based computations with parametric statistical learning in an end-to-end fashion. We give a deep Q network access to intermediate and final results of a differentiable approximation to k-nearest-neighbors performed on a dictionary of historical state-action embeddings. Our method displays the early-learning advantage associated with episodic memory-based algorithms while mitigating the asymptotic performance disadvantage suffered by such approaches. In several cases we find that our model learns even more quickly from few examples than pure kNN-based approaches. Analysis shows that our semiparametric algorithm relies heavily on the kNN output early on and less so as training progresses, which is consistent with complementary learning systems theory.
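
The abstract does not spell out how the differentiable kNN approximation is computed, so the following is only a minimal sketch of one common way to soften a nearest-neighbor lookup, not the authors' implementation. It assumes an inverse-distance kernel over the k closest stored embeddings; the names `soft_knn_lookup`, `keys`, `values`, and `delta` are hypothetical. The neighbor selection itself is discrete, but the weighted average is a smooth function of the query and the stored embeddings, which is what would let gradients flow if the same expression were written in an autodiff framework.

```python
import numpy as np


def soft_knn_lookup(query, keys, values, k=2, delta=1e-3):
    """Kernel-weighted value estimate from the k nearest stored embeddings.

    Hypothetical helper: `keys` holds stored state-action embeddings (one per
    row) and `values` holds the return estimate stored with each embedding.
    """
    # Squared Euclidean distance from the query to every stored key.
    sq_dists = np.sum((keys - query) ** 2, axis=1)

    # Indices of the k smallest distances (their order is not guaranteed).
    k = min(k, len(keys))
    nearest = np.argpartition(sq_dists, k - 1)[:k]

    # Assumed inverse-distance kernel; delta guards against division by zero.
    weights = 1.0 / (sq_dists[nearest] + delta)
    weights = weights / weights.sum()

    # Final result: weighted average of the neighbors' stored values.
    estimate = float(weights @ values[nearest])

    # Return the intermediate results too, since a downstream network
    # could consume them alongside the final estimate.
    return estimate, nearest, weights


# Toy memory with four stored embeddings and their value estimates.
keys = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
values = np.array([1.0, 2.0, 3.0, 10.0])

estimate, nearest, weights = soft_knn_lookup(np.array([0.0, 0.0]), keys, values)
```

In this sketch the query coincides with the first stored embedding, so almost all of the weight falls on that entry and the estimate stays close to its stored value. A semiparametric model in the spirit of the abstract would feed `estimate`, `weights`, or the retrieved embeddings into the Q network rather than using the lookup on its own.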