Exploring Selective Layer Freezing Strategies in Transformer Fine-Tuning: NLI Classifiers with Sub-3B Parameter Models
Abstract: In recent years, methods that selectively fine-tune or reduce the number of layers in large language models (LLMs) have attracted attention as an efficient alternative to traditional full fine-tuning, in which all layers are trained. In this study, we revisit the classical concept of layer freezing and propose a simple, effective strategy that selectively fine-tunes only a subset of transformer layers. We show that freezing the bottom 25% or 50% of layers in small-scale LLMs with fewer than 3 billion parameters yields significant improvements in memory efficiency and training speed while matching, or even surpassing, the performance of full fine-tuning and Low-Rank Adaptation (LoRA). In experiments on Natural Language Inference (NLI) tasks, our approach achieves up to 50% memory savings and 30% faster training. Notably, the method requires no architectural modifications or additional parameters, making it particularly suitable for resource-constrained environments.
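The freezing strategy described in the abstract can be illustrated with a minimal sketch: mark the bottom fraction of transformer blocks (and the embeddings) as non-trainable before fine-tuning an NLI classifier. The checkpoint name, the `model.model.layers` layout (LLaMA-style), and the 50% freeze fraction are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of bottom-layer freezing for NLI fine-tuning.
# Assumes a LLaMA-style sub-3B checkpoint whose decoder blocks live in
# `model.model.layers`; the model name and freeze fraction are examples only.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # assumed sub-3B checkpoint, not from the paper
    num_labels=3,               # NLI: entailment / neutral / contradiction
)

freeze_fraction = 0.5                      # freeze the bottom 50% of layers
layers = model.model.layers                # ModuleList of transformer blocks
num_frozen = int(len(layers) * freeze_fraction)

# Freeze the token embeddings and the bottom `num_frozen` blocks;
# the top blocks and the classification head remain trainable.
for param in model.model.embed_tokens.parameters():
    param.requires_grad = False
for layer in layers[:num_frozen]:
    for param in layer.parameters():
        param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable / total:.1%} of {total:,}")
```

Because frozen layers need no gradients or optimizer state, this is where the memory and speed savings reported in the abstract would come from; the sketch simply shows where such a mask is applied before handing the model to a standard training loop.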
DOI: 10.3390/app151910434