TL;DR: We propose an alternative technique for improving the scalability and extensibility of KANs via a meta-learner.
Abstract: Inspired by the Kolmogorov-Arnold representation theorem, KANs offer a novel framework for function approximation by replacing the fixed weights of traditional neural networks with learnable univariate functions. This design shows significant potential as an efficient and interpretable alternative to MLPs. However, KANs require a substantially larger number of trainable parameters, leading to challenges in memory efficiency and higher training costs compared to MLPs. To address this limitation, we propose to generate the weights of KANs with a smaller meta-learner; we call the resulting framework MetaKANs. By training KANs and MetaKANs in an end-to-end differentiable manner, MetaKANs achieve comparable or even superior performance while significantly reducing the number of trainable parameters and maintaining promising interpretability. Extensive experiments on diverse benchmark tasks, including symbolic regression, partial differential equation solving, and image classification, demonstrate the effectiveness of MetaKANs in improving parameter efficiency and memory usage. The proposed method offers an alternative technique for training KANs that allows for greater scalability and extensibility, and narrows the training-cost gap with MLPs noted in the original KAN paper. Our code is available at \url{https://github.com/Murphyzc/MetaKAN}.
Lay Summary: Kolmogorov-Arnold Networks (KANs) are a new type of neural network that replaces fixed activation functions with more flexible, learnable mathematical functions. This allows them to better understand complex data and offers clearer insights into how decisions are made — a valuable property in scientific and engineering applications. However, this flexibility comes at a cost: KANs require many more parameters than traditional networks, which makes them memory-hungry and harder to train on large datasets.
To solve this, we introduce MetaKANs, a new framework that uses a smaller network (a “meta-learner”) to generate the parameters needed by KANs. Instead of learning each function separately, MetaKANs learn how to generate them efficiently from shared patterns (a rough code sketch of this idea appears after this summary). This reduces memory usage while keeping performance strong.
Our experiments show that MetaKANs match or even exceed standard KANs in accuracy across many tasks — from solving physics equations to recognizing images — using just a fraction of the memory. This makes advanced, interpretable neural networks more practical for real-world problems.
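To make the meta-learner idea above concrete, the following is a minimal PyTorch sketch under our own assumptions: a small shared hypernetwork maps a compact learnable embedding per univariate KAN function to that function's spline coefficients. The names (`MetaLearnerSketch`, `embed_dim`, `num_coeffs`) and the specific architecture are hypothetical illustrations, not the actual MetaKAN implementation from the repository.

```python
import torch
import torch.nn as nn


class MetaLearnerSketch(nn.Module):
    """Hypothetical meta-learner: a small shared network that generates the
    spline coefficients for every learnable univariate function of a KAN
    from compact per-function embeddings."""

    def __init__(self, num_functions: int, embed_dim: int = 16, num_coeffs: int = 8):
        super().__init__()
        # One small learnable embedding per univariate function (KAN "edge").
        # Storing embed_dim numbers per function instead of num_coeffs spline
        # coefficients is where the parameter savings would come from.
        self.embeddings = nn.Parameter(0.1 * torch.randn(num_functions, embed_dim))
        # Shared generator mapping an embedding to that function's coefficients.
        self.generator = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.SiLU(),
            nn.Linear(64, num_coeffs),
        )

    def forward(self) -> torch.Tensor:
        # Output shape: (num_functions, num_coeffs). Because the coefficients
        # are produced by differentiable modules, gradients from the KAN's
        # task loss flow back into both the embeddings and the shared
        # generator, so the whole system trains end to end.
        return self.generator(self.embeddings)


# Usage sketch: generate coefficients for a hypothetical KAN layer with
# 4 inputs and 8 outputs (4 * 8 = 32 univariate functions); feeding them
# into the KAN's spline evaluation is not shown here.
meta = MetaLearnerSketch(num_functions=4 * 8)
coeffs = meta()  # tensor of shape (32, 8)
```

The sketch is only meant to illustrate where the parameter savings come from: per-function storage shrinks to a small embedding, while the coefficient-generating capacity is shared across all functions.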
Link To Code: https://github.com/Murphyzc/MetaKAN
Primary Area: Deep Learning->Algorithms
Keywords: Hypernetwork; Kolmogorov-Arnold networks; Memory efficiency; Scalability and extensibility
Submission Number: 14365