Keywords: continual, learning, cascade, model, hierarchical, Bayesian, online, consolidation, multi-task-learning, deep, neural, networks, hyperparameters, dnn, transfer, memo
TL;DR: Learning multiple tasks in deep networks with a model that combines Bayesian learning with the cascade model.
Abstract: Continual learning poses an important challenge to machine learning models. Kirkpatrick et al. introduced a model that combats forgetting during continual learning by using a Bayesian prior to transfer knowledge across task switches. This approach showed promising results, but the algorithm was given access to the time points at which tasks were switched. Using a model of stochastic learning dynamics, we show that this model is very closely related to the previously developed cascade model for combating catastrophic forgetting. This general formulation also allows us to use the model for online learning, where the network is given no knowledge of task switching times, and to use deeper hierarchies of Bayesian priors. We evaluate this model on the permuted MNIST task. We demonstrate improved task performance during task switching, but find that online learning is still significantly worse when task switching times are unknown to the network.
Category: Stuck paper: I hope to get ideas in this workshop that help me get unstuck and improve this paper
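
For readers unfamiliar with the permuted MNIST task named in the abstract, the following is a minimal sketch of how such a task sequence is commonly built: each task applies its own fixed random permutation to the pixels of every image. It is not taken from the paper. The function name `make_permuted_tasks`, the random stand-in data, and the choice to leave the first task unpermuted are all assumptions made for illustration.

```python
import numpy as np


def make_permuted_tasks(images, num_tasks, seed=0):
    """Build a sequence of permuted-pixel tasks (hypothetical helper).

    images: array of shape (n_samples, n_pixels) holding flattened images.
    Returns (tasks, perms), where tasks[i] is `images` with the columns
    reordered by perms[i].
    """
    rng = np.random.default_rng(seed)
    n_pixels = images.shape[1]
    # Assumption: task 0 keeps the original pixel order.
    perms = [np.arange(n_pixels)]
    for _ in range(num_tasks - 1):
        # One fixed permutation per task, shared by all images in that task.
        perms.append(rng.permutation(n_pixels))
    tasks = [images[:, p] for p in perms]
    return tasks, perms


# Stand-in data with the shape of flattened 28x28 images (not real MNIST).
fake_images = np.random.default_rng(1).random((8, 784))
tasks, perms = make_permuted_tasks(fake_images, num_tasks=3)
```

Because every task contains the same pixel values in a different order, the tasks are equally hard in isolation, which makes any drop in accuracy on earlier tasks attributable to forgetting rather than to task difficulty.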