Uncertainty-guided Lifelong Learning in Bayesian Networks

Sayna Ebrahimi; Mohamed Elhoseiny; Trevor Darrell; Marcus Rohrbach

Uncertainty-guided Lifelong Learning in Bayesian Networks

Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, Marcus Rohrbach

27 Sept 2018 (modified: 05 May 2023)ICLR 2019 Conference Blind SubmissionReaders: Everyone

Abstract: Sequentially learning of tasks arriving in a continuous stream is a complex problem and becomes more challenging when the model has a fixed capacity. Lifelong learning aims at learning new tasks without forgetting previously learnt ones as well as freeing up capacity for learning future tasks. We argue that identifying the most influential parameters in a representation learned for one task plays a critical role to decide on \textit{what to remember} for continual learning. Motivated by the statistically-grounded uncertainty defined in Bayesian neural networks, we propose to formulate a Bayesian lifelong learning framework, \texttt{BLLL}, that addresses two lifelong learning directions: 1) completely eliminating catastrophic forgetting using weight pruning, where a hard selection mask freezes the most certain parameters (\texttt{BLLL-PRN}) and 2) reducing catastrophic forgetting by adaptively regularizing the learning rates using the parameter uncertainty (\texttt{BLLL-REG}). While \texttt{BLLL-PRN} is by definition a zero-forgetting guaranteed method, \texttt{BLLL-REG}, despite exhibiting some small forgetting, is a task-agnostic lifelong learner, which does not require to know when a new task arrives. This feature makes \texttt{BLLL-REG} a more convenient candidate for applications such as robotics or on-line learning in which such information is not available. We evaluate our Bayesian learning approaches extensively on diverse object classification datasets in short and long sequences of tasks and perform superior or marginally better than the existing approaches.

Keywords: lifelong learning, continual learning, sequential learning

TL;DR: We formulate lifelong learning in the Bayesian-by-Backprop framework, exploiting the parameter uncertainty in two settings: for pruning network parameters and in importance weight based continual learning.

8 Replies

Loading