An anatomical substrate of credit assignment in reinforcement learning

J Kornfeld, Y Wang, M Januszewski, A Rother, P Schubert, M Goldman, V Jain, W Denk, MS Fee

Published: 19 Feb 2020, Last Modified: 20 Apr 2026, License: CC BY-SA 4.0

Abstract: A key problem in learning is credit assignment. Biological systems lack a plausible mechanism to implement the backpropagation approach, a method that underlies much of the dramatic progress in artificial intelligence. Here, we use automated connectomic analysis to show that the synaptic architecture of songbird basal ganglia (Area X) supports local credit assignment using a variant of a node perturbation algorithm proposed in a model of reinforcement learning. Using two volume electron microscopy (vEM) datasets, we find that key predictions of the model hold true: axons that encode exploratory variability terminate predominantly on dendritic shafts, while axons that encode song timing (context) terminate predominantly on spines. Based on the detailed EM data, we then built a biophysical model of reinforcement learning that suggests that the synaptic dichotomy between variability and context encoding axons facilitates efficient learning. In combination, these findings provide strong evidence for a general, biologically plausible credit assignment model in vertebrate basal ganglia learning.

One Sentence Summary: Using automated connectomic analysis and biophysical modeling, we show how the basal ganglia could solve the credit assignment problem on the synaptic level.
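The node perturbation idea named in the abstract can be illustrated with a minimal sketch. This is a generic, textbook-style version for a single linear unit, not the variant analyzed in the paper and not its biophysical model. The function names, the squared-error reward, the clean-trial baseline, and all parameter values are assumptions chosen for illustration. The input `x` loosely plays the role of the context signal and the noise `xi` the role of exploratory variability; each weight changes only when its input coincides with noise that altered the reward, which is what makes the rule local.

```python
import random


def train_node_perturbation(inputs, targets, steps=3000, eta=0.05, sigma=0.05, seed=0):
    """Fit one linear unit with node perturbation (illustrative assumptions only)."""
    rng = random.Random(seed)
    w = [0.0] * len(inputs[0])
    for _ in range(steps):
        k = rng.randrange(len(inputs))
        x, target = inputs[k], targets[k]
        # Unperturbed output driven by the "context" input x.
        y_clean = sum(wj * xj for wj, xj in zip(w, x))
        # Exploratory noise injected at the output node.
        xi = rng.gauss(0.0, sigma)
        # Scalar reward is the negative squared error (assumed reward function);
        # the unperturbed trial serves as the baseline.
        r_clean = -(y_clean - target) ** 2
        r_noisy = -(y_clean + xi - target) ** 2
        # Local update: reward change times node noise times presynaptic input.
        gain = eta * (r_noisy - r_clean) * xi / sigma ** 2
        w = [wj + gain * xj for wj, xj in zip(w, x)]
    return w


def mean_squared_error(w, inputs, targets):
    """Average squared error of the noise-free unit over the dataset."""
    errors = [
        (sum(wj * xj for wj, xj in zip(w, x)) - t) ** 2
        for x, t in zip(inputs, targets)
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical toy data, exactly realizable by the weights [0.5, -0.3].
    demo_inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    demo_targets = [0.5, -0.3, 0.2]
    learned = train_node_perturbation(demo_inputs, demo_targets)
    print(learned, mean_squared_error(learned, demo_inputs, demo_targets))
```

Averaged over the noise, this update follows the gradient of the squared error, so the unit learns without any backpropagated error signal. The residual jitter in the learned weights scales with `eta` and `sigma`.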

External IDs: doi:10.1101/2020.02.18.954354