NEOL: REWARD-GATED ONLINE PLASTICITY FOR SCALABLE NEUROEVOLUTION

ICLR 2026 Conference Submission 20028 Authors

19 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: NeuroEvolution, Online Learning, Synaptic Plasticity, Optimisation, Evolutionary Algorithm
Abstract: NeuroEvolution of Augmenting Topologies (NEAT) excels at discovering neural architectures and weights for control tasks (Stanley & Miikkulainen, 2002a). However, direct encoding forces evolution to discover each connection strength individually; in high-dimensional weight spaces, this yields weak credit assignment and poor scaling on large continuous-control problems (Stanley et al., 2009; Peng et al., 2018). We propose NeuroEvolutionary Online Learning (NEOL), which decouples learning signals: the outer loop uses NEAT for topology search, while an inner, reward-modulated local plasticity rule (Hebbian, Oja, or BCM (Hebb, 1949; Oja, 1982; Bienenstock et al., 1982)) adapts synaptic weights online within episodes. Under fixed interaction budgets and multiple seeds across four standard control benchmarks spanning discrete and continuous action spaces, NEOL achieves higher final returns, lower variability, and better sample efficiency than pure NEAT; gains are most pronounced in continuous control. These improvements are statistically significant (Wilcoxon rank-sum tests), and ablations indicate that the benefits persist even when standard genetic weight mutation is reduced or disabled, evidencing a division of labour between structural evolution and online synaptic credit assignment. A simple, gradient-free separation of topology search and reward-gated online plasticity reliably boosts performance and robustness, offering a practical template for linking neuroevolution with online learning and a scalable path toward more adaptive neuroevolutionary agents.
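To make the inner-loop mechanism concrete, here is a minimal sketch of a reward-gated local plasticity update of the kind the abstract describes. The function names, learning-rate parameter, and the exact form of the reward gating are illustrative assumptions, not the paper's published implementation; the Hebbian and Oja variants follow the classical rules (Hebb, 1949; Oja, 1982), each scaled by the scalar reward signal.

```python
import numpy as np


def reward_gated_hebbian(w, pre, post, reward, eta=0.01):
    """Hypothetical reward-gated Hebbian step: dw = eta * r * (post x pre).

    w:      weight matrix of shape (n_post, n_pre)
    pre:    presynaptic activations, shape (n_pre,)
    post:   postsynaptic activations, shape (n_post,)
    reward: scalar reward signal gating the update
    """
    return w + eta * reward * np.outer(post, pre)


def reward_gated_oja(w, pre, post, reward, eta=0.01):
    """Hypothetical reward-gated Oja step with the usual decay term,
    dw_ij = eta * r * (post_i * pre_j - post_i**2 * w_ij),
    which keeps weight norms bounded unlike the plain Hebbian rule."""
    return w + eta * reward * (np.outer(post, pre) - (post ** 2)[:, None] * w)


# Illustrative inner-loop usage within one episode: weights adapt online,
# and a zero reward leaves them unchanged (the "gating").
w = np.zeros((2, 3))
pre = np.array([1.0, 0.0, 1.0])
post = np.array([1.0, 1.0])
w = reward_gated_hebbian(w, pre, post, reward=1.0, eta=0.1)
```

In the NEOL framing, a rule like this would run inside each evaluation episode on the weights of a topology proposed by NEAT's outer loop; only the topology (and, in some ablations, little or no weight mutation) is evolved.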
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 20028