Abstract: In this work, we first analyze the relationship between the performance of different layers and their pre-trained weight matrices using SVD. Based on this, we design Singular-Value Based Adaptive Low-Rank Adaptation (SARA), which adaptively finds a suitable rank for each layer during initialization.
Additionally, we explore Mixture-of-SARA (Mo-SARA), which significantly reduces the number of trainable parameters by fine-tuning only multiple parallel sets of singular values, controlled by a router.
Extensive experiments on various complex tasks have demonstrated the state-of-the-art performance and parameter efficiency of our methods.
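The abstract does not specify the exact rank-selection criterion; as a rough illustration only, the sketch below assumes a per-layer rank chosen so that the retained singular values capture a fixed fraction of the pre-trained matrix's spectral energy. The function name `adaptive_rank` and the `energy_threshold` parameter are hypothetical, not from the paper.

```python
import numpy as np

def adaptive_rank(weight: np.ndarray, energy_threshold: float = 0.9) -> int:
    """Hypothetical sketch: pick the smallest rank whose singular values
    account for `energy_threshold` of the matrix's total spectral energy."""
    # Singular values of the pre-trained weight matrix (no U/V needed).
    s = np.linalg.svd(weight, compute_uv=False)
    # Cumulative fraction of spectral energy captured by the top-k values.
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index where the cumulative energy crosses the threshold.
    return int(np.searchsorted(energy, energy_threshold) + 1)

# Example on a synthetic matrix with true rank at most 16.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 16)) @ rng.standard_normal((16, 64))
r = adaptive_rank(w, 0.9)
```

Under this assumption, layers whose weights have a flatter singular-value spectrum would receive a larger adapter rank than layers dominated by a few directions.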
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: parameter-efficient-training
Contribution Types: Approaches to low-resource settings, Surveys
Languages Studied: English
Submission Number: 169