Theta sequences as eligibility traces: A biological solution to credit assignment

01 Mar 2023 (modified: 12 May 2023). Submitted to Tiny Papers @ ICLR 2023. Readers: Everyone
Keywords: RL, neuroscience, credit assignment, reinforcement learning, theta, sequences, learning, prediction errors, eligibility traces, temporal difference learning, oscillations
TL;DR: Neural oscillations provide a mechanism for efficient long-term credit assignment in biological networks of neurons
Abstract: Credit assignment problems, such as policy evaluation in RL, often require bootstrapping prediction errors through preceding states or maintaining temporally extended memory traces, solutions that are unfavourable or implausible for biological networks of neurons. We propose theta sequences -- chains of neural activity during theta oscillations in the hippocampus, thought to represent rapid playthroughs of awake behaviour -- as a solution. By analysing and simulating a model of theta sequences, we show that they compress behaviour such that existing but short $\mathsf{O}(10)$ ms neuronal memory traces are effectively extended, allowing bootstrap-free credit assignment without long memory traces, equivalent to the use of eligibility traces in TD($\lambda$).
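For readers unfamiliar with the TD($\lambda$) comparison in the abstract, the following is a minimal sketch of standard tabular TD($\lambda$) with accumulating eligibility traces on a linear track. It is not the theta sequence model from the paper. The function name `td_lambda_episode`, the track length, and all parameter values are assumptions chosen for illustration. It contrasts one pass with $\lambda = 0$ (pure bootstrapping, where only the state next to the reward is updated) against $\lambda = 0.8$ (where the trace carries credit back to every visited state).

```python
import numpy as np

def td_lambda_episode(n_states=10, alpha=0.5, gamma=0.9, lam=0.8):
    """One left-to-right pass along a linear track with reward 1 at the end.

    Illustrative only: a hypothetical toy task, not the paper's model.
    """
    V = np.zeros(n_states + 1)  # last entry is the terminal state (value stays 0)
    e = np.zeros(n_states + 1)  # one eligibility trace per state
    for s in range(n_states):
        s_next = s + 1
        r = 1.0 if s_next == n_states else 0.0
        delta = r + gamma * V[s_next] - V[s]  # TD prediction error
        e *= gamma * lam                      # decay every trace
        e[s] += 1.0                           # mark the current state as eligible
        V += alpha * delta * e                # credit all eligible states at once
    return V[:n_states]

V_td0 = td_lambda_episode(lam=0.0)  # no trace: only the final state learns
V_tdl = td_lambda_episode(lam=0.8)  # trace: every visited state learns
```

After a single pass, `V_td0` is nonzero only at the last state, whereas `V_tdl` decays geometrically with distance from the reward. The abstract's claim is that theta sequences achieve the latter kind of single-pass credit assignment using only short neuronal memory traces.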