Highlights
• We focus on the setting where the training data of previous tasks are unavailable.
• We propose a novel, memory-efficient data-free distillation method.
• Our method encodes the knowledge of previous datasets into parameters for distillation.
• Our method shows superiority on multiple continual learning benchmark datasets.