Keywords: Low-rank adaptation, LoRA, Fairness, Evaluations, LLMs, Large Models
TL;DR: Compared to full-model fine-tuning, does low-rank adaptation (LoRA) have any side effects on model fairness? How do we know?
Abstract: Low-rank adaptation of large models for downstream tasks, as exemplified by LoRA, has gained traction due to its computational efficiency. This efficiency, contrasted with the prohibitive costs of full-model fine-tuning, means that practitioners often turn to LoRA, sometimes without fully exploring its ramifications. In this pilot study, we focus on the fairness implications of LoRA, examining its impact on the performance of different subgroups for a given fine-tuning task compared to a full-model fine-tuning baseline. We conduct extensive experiments across vision and language domains, covering both classification and generation tasks, on ViT-Base, Swin-v2-Large, Llama-2 7B, and Mistral 7B. Our findings reveal a nuanced landscape: while it is possible to cherry-pick specific instances where LoRA exacerbates bias among subgroups, we find no significant evidence of a consistent pattern of such disparities across the board. Our study also highlights challenges in assessing fine-tuning fairness for generative tasks in terms of task design and model token bias, urging more rigorous and careful fairness evaluations.
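To make the efficiency contrast in the abstract concrete, below is a minimal sketch of the low-rank update at the heart of LoRA: a frozen pretrained weight W is adapted as W + (alpha / r) * B A, so only the small factors A and B are trained. This is an illustrative toy layer (all names and hyperparameters here are assumptions for illustration), not the submission's code or the PEFT library's API.

```python
# Minimal LoRA-style linear layer (illustrative sketch, hypothetical names).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen pretrained weight (stand-in for a layer of a large model).
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Trainable low-rank factors: only r * (in_features + out_features)
        # parameters are updated, versus in_features * out_features for
        # full-model fine-tuning of this layer.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base path uses the frozen weight; the low-rank path adds the adaptation.
        return x @ self.weight.T + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T

layer = LoRALinear(768, 768, r=8)
out = layer(torch.randn(4, 768))  # -> shape (4, 768)
```

With r = 8 and a 768x768 layer, the trainable parameter count drops from about 590K to about 12K, which is the efficiency argument motivating LoRA's popularity; the study above asks whether this restricted update has subgroup-level fairness side effects.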
Submission Number: 9