Keywords: catastrophic forgetting, linear systems
TL;DR: We improve on previous bounds for forgetting by precisely bounding the numerical range of a product of projections.
Abstract: We provide theoretical bounds on the forgetting quantity in the continual learning setting for linear tasks, where each round of learning corresponds to projecting onto a linear subspace. For a cyclic task ordering on $T$ tasks repeated $m$ times each, we prove the best known upper bound of $O(T^2/m)$ on the forgetting. Notably, our bound holds uniformly over all choices of tasks and is independent of the ambient dimension. Our main technical contribution is a characterization of the union of all numerical ranges of products of $T$ (real or complex) projections as a sinusoidal spiral, which may be of independent interest.
Supplementary Material: pdf
Submission Number: 11338
Loading