TL;DR: We show how intermediate representations in neural networks can be leveraged to improve both performance and efficiency in continual learning through auxiliary classifiers (ACs).
Abstract: Continual learning is crucial for applying machine learning in challenging, dynamic, and often resource-constrained environments. However, catastrophic forgetting, the overwriting of previously learned knowledge when new information is acquired, remains a major challenge. In this work, we examine the intermediate representations in neural network layers during continual learning and find that these representations are less prone to forgetting, highlighting their potential to accelerate computation. Motivated by these findings, we propose to use auxiliary classifiers (ACs) to enhance performance and demonstrate that integrating ACs into various continual learning methods consistently improves accuracy across diverse evaluation settings, yielding an average 10% relative gain. We also leverage the ACs to reduce the average inference cost by 10-60% without compromising accuracy, enabling the model to return predictions before computing all of its layers. Our approach provides a scalable and efficient solution for continual learning.
Lay Summary: Standard machine learning models often suffer from *catastrophic forgetting*, where learning new data causes them to forget what they’ve previously learned. Continual learning aims to address this by enabling models to learn incrementally, without retraining from scratch.
In this work, we propose a simple and effective way to improve continual learning by adding small *auxiliary classifiers* (ACs) to intermediate layers of the network. These classifiers operate on earlier features, which we demonstrate are more stable and less prone to forgetting.
Integrating ACs improves accuracy across a broad range of continual learning methods, yielding an average 10% relative gain. Surprisingly, on older data, intermediate classifiers often outperform the final classifier. They also enable early exits, allowing the model to make predictions without using all of its layers, which can reduce inference costs by up to 60% without compromising accuracy.
Our approach is architecture-agnostic, easy to implement, and improves both performance and efficiency, making continual learning more practical and reliable in real-world settings.
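To make the idea concrete, below is a minimal sketch of attaching auxiliary classifiers to intermediate feature blocks and using them for confidence-thresholded early exits. This is not the authors' implementation (see the linked repository below for that); the class name `ACBackbone`, the `threshold` parameter, and the assumption of convolutional feature maps and a single input per call are illustrative choices made for this sketch.

```python
# Minimal sketch, assuming a PyTorch backbone split into feature blocks
# that output 4D convolutional feature maps. All names here are illustrative.
import torch
import torch.nn as nn


class ACBackbone(nn.Module):
    """Wraps a sequence of feature blocks and adds a small classifier per block."""

    def __init__(self, backbone_blocks, feature_dims, num_classes):
        super().__init__()
        self.blocks = nn.ModuleList(backbone_blocks)
        # One lightweight auxiliary classifier per intermediate representation.
        self.aux_heads = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(d, num_classes))
            for d in feature_dims
        )

    def forward(self, x):
        # Training: return logits from every classifier so all of them
        # can be supervised jointly (e.g., by summing per-head losses).
        logits = []
        for block, head in zip(self.blocks, self.aux_heads):
            x = block(x)
            logits.append(head(x))
        return logits

    @torch.no_grad()
    def predict_early_exit(self, x, threshold=0.9):
        # Inference (single input assumed): stop at the first classifier whose
        # maximum softmax probability exceeds the threshold, skipping the
        # remaining layers and thereby reducing computation.
        for block, head in zip(self.blocks, self.aux_heads):
            x = block(x)
            probs = head(x).softmax(dim=-1)
            if probs.max() >= threshold:
                return probs
        return probs  # fall back to the final classifier's prediction
```

The early-exit threshold trades accuracy for compute: a lower threshold exits earlier on average, while a threshold of 1.0 recovers standard full-depth inference.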
Link To Code: https://github.com/fszatkowski/cl-auxiliary-classifiers
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: Continual Learning, Class-incremental Learning, Early-exits
Submission Number: 1554