Keywords: meta-learning, plasticity, credit assignment, reinforcement learning, RNNs
Abstract: Biological circuits routinely solve complex tasks that require learning from sparse, delayed reinforcement, yet the synaptic mechanisms that enable long-timescale credit assignment remain poorly understood. Artificial recurrent neural networks (RNNs), by contrast, are trained with backpropagation through time or related algorithms that rely on biologically unrealistic feedback connectivity structures and non-local information. While prior work has largely focused on hand-crafted synaptic updates for unsupervised neural circuit self-organization, or on biologically plausible approximations of backpropagation, the space of plasticity rules capable of supporting structured credit assignment from delayed feedback remains vastly underexplored.
Here, we introduce a meta-learning framework that discovers families of three-factor plasticity rules capable of training recurrent networks from sparse reward feedback. In our formulation, synaptic updates combine local eligibility traces (capturing pre- and postsynaptic interactions) with delayed reinforcement signals that modulate weight changes. The eligibility traces are not fixed but parameterized as polynomial expansions of neural activity, enabling richer pre/post interactions beyond classical Hebbian forms. These parameters are optimized in an outer meta-learning loop using analytically derived gradient estimates inspired by the REINFORCE gradient estimator, allowing the learning rules themselves to adapt to task requirements.
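To make the inner-loop update concrete, the following is a minimal NumPy sketch of a three-factor rule of the kind described above: an eligibility trace built from a polynomial expansion of pre- and postsynaptic activity, gated by a delayed reward signal. All names, shapes, and constants (theta, GAMMA, ETA, DEGREE, step) are illustrative assumptions, not the paper's implementation, and the outer meta-learning loop that adjusts the polynomial coefficients is omitted.

```python
import numpy as np

# Hypothetical sketch of a three-factor plasticity rule: a local eligibility
# trace from a polynomial pre/post expansion, gated by a delayed reward.
# All names and constants here are illustrative assumptions.

rng = np.random.default_rng(0)

N = 50        # number of recurrent units
GAMMA = 0.9   # eligibility-trace decay factor
ETA = 1e-3    # inner-loop learning rate
DEGREE = 2    # max degree of the polynomial expansion

# Meta-learned coefficients: theta[a, b] weights the monomial pre^a * post^b.
theta = rng.normal(scale=0.1, size=(DEGREE + 1, DEGREE + 1))

W = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))  # recurrent weights
elig = np.zeros_like(W)                              # eligibility traces
h = np.zeros(N)                                      # unit activities

def step(x_in, reward):
    """One inner-loop step: update activities, traces, and weights."""
    global h, elig, W
    pre = h                            # presynaptic activity at time t-1
    h = np.tanh(W @ h + x_in)          # postsynaptic activity at time t

    # Polynomial pre/post interaction for each synapse (i <- j):
    # local[i, j] = sum_{a,b} theta[a, b] * pre[j]**a * h[i]**b
    pre_pows = np.stack([pre**a for a in range(DEGREE + 1)])   # (D+1, N)
    post_pows = np.stack([h**b for b in range(DEGREE + 1)])    # (D+1, N)
    local = post_pows.T @ theta.T @ pre_pows                   # (N, N)

    elig = GAMMA * elig + local        # accumulate trace, with decay
    W += ETA * reward * elig           # third factor gates the weight change
```

In this sketch, only theta would be treated as meta-parameters: the outer loop would estimate gradients of the task return with respect to theta (REINFORCE-style, per the abstract) while the inner loop applies the purely local update shown here. Setting theta to zero everywhere except theta[1, 1] recovers a classical reward-modulated Hebbian rule.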
Our results show that different plasticity rules yield qualitatively distinct network dynamics, representations, and learning trajectories. We further analyze the emergent dynamical structures produced by each rule, revealing how plasticity shapes computation in recurrent circuits.
We demonstrate that meta-learning can uncover biologically grounded synaptic rules that enable structured credit assignment without global supervision. Beyond offering candidate mechanisms for cortical learning, our approach provides insight into how plasticity shapes neural computation.
Submission Number: 331