Abstract: Parameter-efficient fine-tuning (PEFT) techniques, such as adapter tuning, aim to fine-tune a pre-trained language model (PLM) using a minimal number of parameters for a specific task or profile. Although adapter tuning is far more parameter-efficient than full-model fine-tuning, it still attaches a small set of additional parameters to the PLM for each profile. This becomes problematic in practical applications with many profiles, since the total number of additional parameters grows linearly with the number of profiles. To mitigate this issue, we introduce X-PEFT, a novel PEFT method that leverages a multitude of given adapters by fine-tuning an extremely small set of compact tensors for a new profile; these tensors serve as binary masks that adaptively select among the given adapters. To validate our proposed method efficiently, we implement it using a large number of random adapters instead of learned ones. Remarkably, this can be understood as an adapter-based version of the supermask concept, in line with the principles of the Lottery Ticket Hypothesis. We evaluate X-PEFT on GLUE tasks and demonstrate that it matches or surpasses conventional adapter tuning while reducing per-profile memory requirements by a factor of 10,000.
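To make the masking idea in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' implementation: a frozen pool of (random) bottleneck adapters, with the only per-profile trainable tensor being a vector of mask scores that is binarized to select adapters. The class name, shapes, straight-through binarization, and weight-averaging aggregation are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class MaskedAdapterLayer(nn.Module):
    """Hypothetical sketch: a pool of frozen bottleneck adapters,
    selected per profile by a single trainable mask-score tensor."""

    def __init__(self, hidden_dim=768, bottleneck_dim=16, num_adapters=100):
        super().__init__()
        # Frozen adapter pool (random here, as in the paper's validation setup).
        self.down = nn.Parameter(
            0.02 * torch.randn(num_adapters, hidden_dim, bottleneck_dim),
            requires_grad=False,
        )
        self.up = nn.Parameter(
            0.02 * torch.randn(num_adapters, bottleneck_dim, hidden_dim),
            requires_grad=False,
        )
        # The only per-profile trainable tensor: one score per adapter.
        self.mask_scores = nn.Parameter(0.01 * torch.randn(num_adapters))

    def forward(self, x):  # x: (batch, seq_len, hidden_dim)
        # Binarize the scores with a straight-through estimator
        # (an assumed choice for making the hard mask trainable).
        probs = torch.sigmoid(self.mask_scores)
        mask = (probs > 0.5).float() + probs - probs.detach()
        denom = mask.sum().clamp(min=1.0)
        # Average the projection weights of the selected adapters
        # (one possible aggregation; not necessarily the paper's).
        down = torch.einsum("n,nhb->hb", mask, self.down) / denom
        up = torch.einsum("n,nbh->bh", mask, self.up) / denom
        return x + torch.relu(x @ down) @ up  # residual adapter output


# Usage: only mask_scores receives gradients; the adapter pool stays frozen,
# so the per-profile storage cost is just the mask tensor.
layer = MaskedAdapterLayer()
out = layer(torch.randn(2, 8, 768))
```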
Paper Type: long
Research Area: Machine Learning for NLP
Contribution Types: Approaches to low-resource settings
Languages Studied: English
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.