BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models

Published: 25 Sept 2024 · Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: Bayesian Neural Network, Finetuning, Large Language Models
TL;DR: We introduce a principled Bayesian framework for improving large language models' generalization and uncertainty estimation.
Abstract: Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by employing approximate Bayesian estimation after the LLMs are trained, enabling them to quantify uncertainty. However, the performance of such post-training approaches is severely limited by the parameters already learned during training. In this paper, we go beyond post-training Bayesianization and propose Bayesian Low-Rank Adaptation by Backpropagation (BLoB), an algorithm that continuously and jointly adjusts both the mean and covariance of LLM parameters throughout the entire fine-tuning process. Our empirical results verify the effectiveness of BLoB in terms of generalization and uncertainty estimation when evaluated on both in-distribution and out-of-distribution data.
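
The abstract's core mechanism, jointly learning the mean and covariance of low-rank adapter parameters by backpropagation, can be illustrated with a minimal variational sketch. Everything below is an illustrative assumption rather than the paper's released implementation: the class name `BayesianLoRALinear`, the choice to make only the up-projection factor stochastic, the fully factorized Gaussian posterior, and the `prior_std` hyperparameter are all placeholders for the idea, not BLoB's exact parameterization.

```python
# A minimal sketch of Bayesian low-rank adaptation trained by backpropagation:
# a LoRA-style adapter whose up-projection B has a learnable mean and log-std,
# sampled via the reparameterization trick so both are updated jointly during
# fine-tuning. Illustrative only; not the authors' code.
import math

import torch
import torch.nn as nn


class BayesianLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, prior_std: float = 0.1):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weight stays frozen
        in_f, out_f = base.in_features, base.out_features
        # Deterministic down-projection A; stochastic up-projection B with a
        # learnable mean and log-std (an assumed factorization for illustration).
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B_mean = nn.Parameter(torch.zeros(out_f, rank))
        self.B_log_std = nn.Parameter(torch.full((out_f, rank), -5.0))
        self.prior_std = prior_std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        std = self.B_log_std.exp()
        # Reparameterization: sample B so gradients flow to mean and std.
        B = self.B_mean + std * torch.randn_like(std)
        return self.base(x) + x @ self.A.t() @ B.t()

    def kl(self) -> torch.Tensor:
        # KL( N(mean, std^2) || N(0, prior_std^2) ), summed over all entries
        # of the factorized Gaussian posterior over B.
        var = (2 * self.B_log_std).exp()
        return 0.5 * ((var + self.B_mean**2) / self.prior_std**2
                      - 1 - 2 * self.B_log_std
                      + 2 * math.log(self.prior_std)).sum()
```

Under this sketch, fine-tuning would minimize the task negative log-likelihood plus `kl()` scaled by the reciprocal of the training-set size (the standard ELBO objective), and inference would average predictions over several posterior samples of B, which is one way to obtain the uncertainty estimates the abstract refers to.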
Primary Area: Probabilistic methods (for example: variational inference, Gaussian processes)
Submission Number: 8582