TL;DR: Continual learning works best when adjacent tasks are dissimilar and the final tasks are representative of the full task set.
Abstract: Continual learning of multiple tasks remains a major challenge for neural networks. Here, we investigate how task order influences continual learning and propose a strategy for optimizing it. Leveraging a linear teacher-student model with latent factors, we derive an analytical expression relating task similarity and ordering to learning performance. Our analysis reveals two principles that hold across a wide range of parameters: (1) tasks should be arranged from the least representative to the most typical, and (2) adjacent tasks should be dissimilar. We validate these rules on both synthetic data and real-world image classification datasets (Fashion-MNIST, CIFAR-10, CIFAR-100), demonstrating consistent performance improvements in both multilayer perceptrons and convolutional neural networks. Our work thus presents a generalizable framework for task-order optimization in task-incremental continual learning.
Lay Summary: Neural networks often forget old skills when they learn new ones, a problem called “catastrophic forgetting.” This study shows that the order in which a network learns its tasks can make a big difference. By analyzing a simple mathematical model and then testing the rules on real image-recognition problems, we discovered two practical guidelines: start with the tasks that look least like the average of the whole set, and avoid placing two very similar tasks back-to-back. Following these guidelines consistently boosted accuracy on datasets such as Fashion-MNIST and CIFAR-10/100, even when the rules were estimated from only a small fraction of the data. The work offers a recipe for arranging learning tasks so that AI systems retain old knowledge while smoothly acquiring new skills.
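The two ordering rules can be illustrated with a small heuristic. The sketch below is not the authors' algorithm; it assumes tasks can be compared through a pairwise similarity matrix (for instance, cosine similarity of mean feature vectors estimated from a small data subset), and the function name `order_tasks` is purely illustrative. It places the most typical task last and greedily keeps adjacent tasks dissimilar.

```python
# Minimal sketch (not the paper's implementation) of a task-ordering heuristic
# following the two rules: most typical task last, adjacent tasks dissimilar.
import numpy as np


def order_tasks(similarity: np.ndarray) -> list[int]:
    """Order tasks given a symmetric pairwise similarity matrix."""
    n = similarity.shape[0]
    typicality = similarity.mean(axis=1)       # how representative each task is
    last = int(np.argmax(typicality))          # rule 1: most typical task goes last
    remaining = set(range(n)) - {last}
    if not remaining:
        return [last]

    # rule 1 (start): begin with the least typical remaining task
    current = min(remaining, key=lambda t: typicality[t])
    order = [current]
    remaining.discard(current)

    # rule 2: greedily pick the task least similar to the one just trained
    while remaining:
        current = min(remaining, key=lambda t: similarity[order[-1], t])
        order.append(current)
        remaining.discard(current)

    order.append(last)
    return order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(5, 16))           # toy per-task feature summaries
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    sim = feats @ feats.T                      # cosine similarity between tasks
    print(order_tasks(sim))
```

The greedy adjacency step is only one way to discourage consecutive similar tasks; the similarity estimate itself can come from a tiny held-out sample per task, in the spirit of the lay summary above.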
Primary Area: Theory->Deep Learning
Keywords: Lifelong Learning, Curriculum Learning
Submission Number: 5263