Abstract: In continual learning (CL), newly arriving data are often out-of-distribution relative to previously seen data, causing drastic representation shift (RS) when the old model is updated on the new data and leading to catastrophic forgetting. In this work, we propose feature boosting calibration (FBC) to tackle this problem. Specifically, an expanded module is trained on all classes, both old and new, discovering critical features missed by the original/old model. Then, an FBC network (FBCN) is trained to exploit these missed features to calibrate the old representations. Because the missed features carry additional information for distinguishing the old classes from the new ones, FBCN produces calibrated representations with more transferable features, thus alleviating RS. Next, because only a limited memory is available to store samples of previously learned classes, the training data are severely imbalanced between the old and new classes. To cope with this problem, we propose blockwise knowledge distillation (BWKD), which splits the softmax layer into blocks according to class frequency and then distills each block separately, effectively mitigating the data imbalance. Building on these two improvements, we propose a two-stage training framework for CL, named CKDF-V2, an enhanced version of the cascaded knowledge distillation framework (CKDF). Furthermore, we integrate it with a task-token expansion method to develop a novel approach for CL based on the vision transformer (ViT). Extensive experiments show that both the convolutional neural network (CNN)-based and ViT-based versions of CKDF-V2 obtain favorable results across multiple CL benchmarks.
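To make the BWKD idea concrete, below is a minimal sketch of a blockwise distillation loss, assuming the simplest split of the softmax layer into two blocks (old classes vs. new classes); the paper's actual blocking by class frequency, as well as the function name `bwkd_loss`, the temperature, and the loss weighting here, are illustrative assumptions rather than the authors' exact formulation.

```python
# Hedged sketch of blockwise knowledge distillation (BWKD).
# Assumption: the classifier logits are split into an old-class block and a
# new-class block, and each block is distilled separately so that the frequent
# new classes cannot dominate the distillation signal for the rare old classes.
import torch
import torch.nn.functional as F


def bwkd_loss(student_logits: torch.Tensor,
              teacher_logits: torch.Tensor,
              num_old_classes: int,
              temperature: float = 2.0) -> torch.Tensor:
    """Distill the old-class and new-class logit blocks separately.

    student_logits, teacher_logits: tensors of shape (batch, num_old + num_new).
    num_old_classes: index separating the old block from the new block.
    """
    blocks = [(0, num_old_classes), (num_old_classes, student_logits.size(1))]
    loss = student_logits.new_zeros(())
    for start, end in blocks:
        s_block = student_logits[:, start:end] / temperature
        t_block = teacher_logits[:, start:end] / temperature
        # Softmax is taken within each block, so distillation is balanced
        # per block rather than over the imbalanced joint label space.
        loss = loss + F.kl_div(
            F.log_softmax(s_block, dim=1),
            F.softmax(t_block, dim=1),
            reduction="batchmean",
        ) * temperature ** 2
    return loss / len(blocks)
```

In practice such a loss would be combined with the standard cross-entropy on the new task's labels; how the two terms are weighted is left unspecified here.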
External IDs: dblp:journals/tnn/LiCWY25