Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

Published: 11 Oct 2024, Last Modified: 10 Nov 2024, M3L Poster, CC BY 4.0
Keywords: continual learning, multi-task learning, sequential learning, statistical physics, optimal control theory, training dynamics
TL;DR: This theoretical work combines statistical physics and control theory to design optimal task-selection protocols that mitigate catastrophic forgetting while preserving performance in continual learning, validated on synthetic and real-world data.
Abstract: Artificial neural networks often struggle with _catastrophic forgetting_ when learning tasks sequentially, as training on new tasks degrades performance on earlier ones. Recent theoretical work has tackled this issue by analysing learning curves in synthetic settings with predefined training protocols. However, these protocols were heuristic and lacked a solid theoretical foundation for assessing their optimality. We address this gap by combining exact training dynamics equations, derived using statistical physics, with optimal control methods. We apply this approach to teacher-student models of continual learning, obtaining a theory for task-selection protocols that optimise performance while minimising forgetting. Our analysis offers non-trivial yet interpretable strategies, showing how optimal learning protocols modulate established effects, such as the influence of task similarity on forgetting. We validate our theoretical findings on real-world data.
Is Neurips Submission: No
Submission Number: 33