Parameter-Efficient Fine-Tune on Open Pre-trained Transformers for Genomic Sequence

Huixin Zhan; Zijun Frank Zhang

Published: 27 Oct 2023, Last Modified: 29 Nov 2023, Venue: GenBio@NeurIPS2023 Poster

Keywords: Parameter efficient fine-tune, Open pre-trained transformers, Genomic data, Language models

TL;DR: We propose a method using adaptive low-rank adaptation and random sampling to efficiently fine-tune open pre-trained transformers on genomic sequences.

Abstract: Pre-trained foundation models (PFMs) for DNA have recently made notable advances in unraveling the linguistic nuances of the genome. As these foundation models grow in parameter count and the number of downstream genomic tasks increases, Parameter-Efficient Fine-Tuning (PEFT) has become the de facto approach to fine-tuning PFMs at reduced computational cost. Low-rank adapters and adaptive low-rank adaptation (AdaLoRA) are popular PEFT methods that introduce learnable truncated singular value decomposition modules for efficient fine-tuning. However, both methods are deterministic, i.e., once a singular value is pruned, it stays pruned throughout the fine-tuning process. Consequently, deterministic PEFT methods can underperform if the initial states, before pruning, are suboptimal, a challenge frequently encountered in genomics due to data heterogeneity. To address this issue, we propose AdaLoRA with random sampling (AdaLoRA+RS), which prunes singular vectors and stochastically reintroduces pruned ones while adhering to a cubic budget schedule. We evaluate AdaLoRA+RS on PFMs from the genome domain (DNABERT 1/2 and Nucleotide Transformer) and from the language domain (open pre-trained transformers, OPT). AdaLoRA+RS performs from slightly above to on par with Full-Model Fine-Tuning (FMFT) across $13$ genomic sequence datasets on two genome understanding tasks, while using less than $2\%$ of the trainable parameters. For instance, in human promoter detection, OPT-$350$M with AdaLoRA+RS achieves a $4.4\%$ AUC increase over its FMFT baseline while using only $1.8\%$ of the trainable parameters. AdaLoRA+RS thus provides a powerful PEFT strategy for modeling genomic sequences.
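The abstract names two mechanisms, a cubic budget schedule and stochastic reintroduction of pruned singular vectors, without giving formulas. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation: `cubic_budget` uses the cubic decay form commonly associated with AdaLoRA, and `select_active` uses a hypothetical rule in which each pruned index is revived with a fixed probability `reintroduce_prob` and displaces the lowest-scoring kept index so the budget still holds. All function names, parameters, and the sampling rule are assumptions introduced here.

```python
import random


def cubic_budget(step, total_steps, init_budget, target_budget,
                 warmup_steps, final_steps):
    """Number of singular values allowed at `step` (assumed cubic decay).

    The budget stays at `init_budget` during warmup, decays cubically,
    and stays at `target_budget` for the last `final_steps` steps.
    """
    if step < warmup_steps:
        return init_budget
    if step >= total_steps - final_steps:
        return target_budget
    span = total_steps - final_steps - warmup_steps
    progress = (step - warmup_steps) / span
    return int(target_budget
               + (init_budget - target_budget) * (1.0 - progress) ** 3)


def select_active(scores, budget, reintroduce_prob, rng):
    """Pick `budget` indices: top-scoring ones, with random revivals.

    Hypothetical sampling rule: each pruned index is revived with
    probability `reintroduce_prob`; revived indices replace the
    lowest-scoring kept indices so the total never exceeds `budget`.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept, pruned = ranked[:budget], ranked[budget:]
    revived = [i for i in pruned if rng.random() < reintroduce_prob]
    revived = revived[:budget]
    kept = kept[:budget - len(revived)] + revived
    return sorted(kept)


if __name__ == "__main__":
    rng = random.Random(0)
    importance = [0.9, 0.1, 0.5, 0.7, 0.3, 0.2]  # made-up scores
    for step in (0, 30, 60, 90):
        b = cubic_budget(step, 100, init_budget=6, target_budget=2,
                         warmup_steps=10, final_steps=20)
        print(step, b, select_active(importance, b, 0.2, rng))
```

With `reintroduce_prob=0` the selection reduces to deterministic top-k pruning, which is the behavior the abstract attributes to existing methods; any positive probability gives pruned directions a chance to return.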

Submission Number: 32
