Can the Spectrum of the Neural Tangent Kernel Anticipate Fine-Tuning Performance?

Published: 10 Oct 2024, Last Modified: 19 Nov 2024 · AFM 2024 Poster · CC BY 4.0
Keywords: Fine-tuning, Linearized large models, Neural tangent kernel, Low-rank adaptation, LoRA
Abstract: Parameter-Efficient Fine-tuning (PEFT) offers a scalable and resource-efficient solution for adapting large models. Despite its popularity, the mechanisms underlying the performance of PEFT, in terms of both empirical risk and generalization, remain underexplored. In this paper, we provide new insights into fine-tuning by analyzing PEFT through the lens of kernel methods, specifically by examining the relationship between the Neural Tangent Kernel (NTK) spectrum and the effectiveness of fine-tuning. Our findings reveal a strong correlation between the NTK spectrum and the model's adaptation performance, shedding light on both empirical risk and generalization properties. We evaluate our theory with Low-Rank Adaptation (LoRA) on large language models. These insights not only deepen our understanding of LoRA but also offer a novel perspective for enhancing other PEFT techniques, paving the way for more robust and efficient adaptation in large language models.
Submission Number: 128
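For readers unfamiliar with the quantity the abstract refers to, the following is a minimal, hypothetical sketch (not code from the paper) of how one could estimate the empirical NTK Gram matrix restricted to the LoRA parameters and inspect its eigenvalue spectrum. The helper name `lora_ntk_spectrum`, the scalarization of the model output, and the `lora_params` argument are all illustrative assumptions.

```python
import torch

def lora_ntk_spectrum(model, lora_params, inputs):
    """Sketch: empirical NTK restricted to LoRA parameters.

    Builds K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>, where theta
    are only the LoRA parameters, then returns K and its eigenvalues.
    Assumes each model call is reduced to a scalar for simplicity.
    """
    grads = []
    for x in inputs:
        model.zero_grad()
        out = model(x.unsqueeze(0)).sum()  # scalar surrogate output per example
        g = torch.autograd.grad(out, lora_params)
        grads.append(torch.cat([gi.reshape(-1) for gi in g]))
    J = torch.stack(grads)               # (n, p) Jacobian w.r.t. LoRA parameters
    K = J @ J.T                           # (n, n) empirical NTK Gram matrix
    eigvals = torch.linalg.eigvalsh(K)    # spectrum (ascending eigenvalues)
    return K, eigvals
```

Under the paper's premise, properties of `eigvals` (e.g., its decay or effective rank) would be the quantities correlated with adaptation performance; the specific statistics studied are described in the full paper, not in this sketch.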