Variational Low-Rank Adaptation Using IVON

Published: 10 Oct 2024, Last Modified: 10 Oct 2024, FITML 2024 Poster, CC BY 4.0
Keywords: Large language model, Low-rank adaptation, Variational learning, Model calibration
TL;DR: We use the IVON optimizer to improve accuracy and calibration in LoRA finetuning of large language models.
Abstract: We use variational learning to improve the accuracy and calibration of low-rank adaptation (LoRA) finetuning in large language models. Specifically, we employ the Improved Variational Online Newton (IVON) optimizer, a drop-in replacement for AdamW that significantly improves performance at negligible overhead. We test our method by finetuning a Llama 2 model with 7 billion parameters on a range of commonsense reasoning datasets. Compared to AdamW finetuning, IVON improves accuracy by 2.8% and ECE by 4.6% on average. Our work provides further evidence for the effectiveness of variational learning in large language models. A link to the code will be provided in the final paper.
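The sketch below illustrates what such a drop-in swap might look like in practice: replacing AdamW with IVON for the trainable LoRA parameters. It is not the authors' released code; it assumes the publicly available `ivon-opt` package (github.com/team-approx-bayes/ivon) together with Hugging Face `transformers` and `peft`, and all hyperparameters and helper names are illustrative.

```python
# Minimal sketch (not the authors' code): LoRA finetuning with IVON in place
# of AdamW. Assumes the ivon-opt package and Hugging Face transformers/peft;
# model name, LoRA settings, and learning rate are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from ivon import IVON  # assumed import path of the ivon-opt package


def build_lora_model(model_name: str = "meta-llama/Llama-2-7b-hf"):
    """Wrap a base causal LM with LoRA adapters; only the adapters are trainable."""
    base = AutoModelForCausalLM.from_pretrained(model_name)
    cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
    return get_peft_model(base, cfg)


def finetune_with_ivon(model, train_loader, num_train_examples, epochs=1):
    """Train the LoRA parameters with IVON instead of AdamW."""
    trainable = [p for p in model.parameters() if p.requires_grad]
    # IVON maintains a Gaussian posterior over the trainable weights; `ess`
    # (effective sample size) is typically set near the dataset size (assumption).
    optimizer = IVON(trainable, lr=0.1, ess=num_train_examples)

    model.train()
    for _ in range(epochs):
        for batch in train_loader:
            # Each step draws one weight sample from the variational posterior
            # via the library's sampled_params context manager.
            with optimizer.sampled_params(train=True):
                loss = model(**batch).loss
                loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model
```

Because IVON keeps the same step interface as AdamW, the rest of an existing LoRA training loop (data loading, scheduling, checkpointing) can stay unchanged; the main additions are the posterior sampling context and the effective-sample-size argument.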
Submission Number: 56