Kolmogorov-Arnold Networks Still Catastrophically Forget but Differently from MLP

Published: 01 Jan 2025 · Last Modified: 12 Nov 2025 · AAAI 2025 · CC BY-SA 4.0
Abstract: Catastrophic forgetting occurs when a neural network loses previously learned information after being trained on new tasks sequentially. Avoiding catastrophic forgetting could reduce the resources needed to update neural networks. Recently, Kolmogorov–Arnold Networks (KAN) gained the community's attention after preliminary experiments suggested that KAN avoid catastrophic forgetting. KAN replace the fixed edge weights of a neural network with learnable B-splines and sum the incoming edge activations at each node. Proponents argue that KAN avoid forgetting, are more accurate, are interpretable, and use fewer parameters. Our work investigates the claim that KAN avoid catastrophic forgetting and finds that they fail to do so on more complex datasets whose features overlap between tasks; we give a simple explanation of why and how KAN catastrophically forget. Motivated by evidence that KAN are superior for symbolic regression, we augment KAN with the same continual learning strategies used for multilayer perceptrons (MLP), making special accommodations to support KAN. Our experiments show that unmodified KAN often forget more than MLP, but that KAN can outperform MLP when combined with continual learning strategies. We aim to highlight the current shortcomings and strengths of KAN for continual learning.
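
The edge-spline construction described in the abstract can be made concrete with a small sketch. The forward-only layer below is illustrative only, not the paper's implementation: names such as `KANLayer` and `n_basis` and the knot layout are assumptions, and practical KAN variants typically add a base activation and residual terms omitted here.

```python
import numpy as np
from scipy.interpolate import BSpline


class KANLayer:
    """Minimal forward-only sketch of a KAN-style layer.

    Each edge (input i -> output j) carries its own learnable univariate
    B-spline; each output node simply sums its incoming edge activations.
    """

    def __init__(self, in_dim, out_dim, n_basis=8, degree=3, x_range=(-1.0, 1.0)):
        self.in_dim, self.out_dim, self.degree = in_dim, out_dim, degree
        # Clamped uniform knot vector: len(knots) = n_basis + degree + 1.
        inner = np.linspace(*x_range, n_basis - degree + 1)
        self.knots = np.concatenate(
            [np.full(degree, inner[0]), inner, np.full(degree, inner[-1])]
        )
        # One coefficient vector per edge; these are what training would update.
        rng = np.random.default_rng(0)
        self.coef = rng.normal(scale=0.1, size=(out_dim, in_dim, n_basis))

    def forward(self, x):
        # x: (batch, in_dim). Clip inputs to the spline's supported range.
        x = np.clip(x, self.knots[0], self.knots[-1])
        out = np.zeros((x.shape[0], self.out_dim))
        for j in range(self.out_dim):
            for i in range(self.in_dim):
                # Rebuilding BSpline objects per call is slow but keeps the
                # edge-wise structure explicit for illustration.
                spline = BSpline(self.knots, self.coef[j, i], self.degree)
                out[:, j] += spline(x[:, i])
        return out


layer = KANLayer(in_dim=2, out_dim=3)
print(layer.forward(np.array([[0.3, -0.5], [0.9, 0.1]])).shape)  # (2, 3)
```

Because each spline's coefficients only affect the function locally (within the span of a few knots), updates from a new task can overwrite exactly the regions of input space shared with earlier tasks, which is one intuition for why overlapping features matter in the abstract's claim.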