Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models

Anonymous

17 Apr 2023 · ACL ARR 2023 April Blind Submission
Abstract: Given the scale of current Pre-trained Language Models (PLMs), conventional fine-tuning has become increasingly challenging, making parameter-efficient tuning a focus of cutting-edge research. To transfer PLMs to downstream tasks, prior techniques in this field insert tunable adapters into the Multi-Head Attention (MHA) and/or Feed-Forward Network (FFN) of Transformer blocks. However, the potential of Layer Normalization (LayerNorm) for parameter-efficient tuning has been overlooked, despite it being a crucial component of the Transformer architecture. In this paper, we first propose LN-tuning, which is time-efficient and outperforms BitFit with only half the tunable parameters. Moreover, a unified framework combining prefix-tuning and LN-tuning achieves SOTA performance. Lastly, we provide a better understanding of LN-tuning through an ablation study and a visualization experiment on the bias and gain terms.
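The abstract does not spell out the mechanics, but LN-tuning amounts to updating only the LayerNorm gain and bias terms of a frozen PLM. Below is a minimal PyTorch sketch of that idea, assuming a Hugging Face Transformers model; the model name, task head, and optimizer settings are illustrative choices, not details from the paper.

```python
# Sketch of LN-tuning: freeze all PLM parameters except the LayerNorm
# gain (weight) and bias terms, then fine-tune only those.
import torch
from transformers import AutoModelForSequenceClassification

# Illustrative backbone and task; the paper's exact setup may differ.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only LayerNorm parameters (gain and bias).
for module in model.modules():
    if isinstance(module, torch.nn.LayerNorm):
        for param in module.parameters():
            param.requires_grad = True

# Report the resulting parameter budget.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} / {total:,} ({100 * trainable / total:.3f}%)")

# Optimize only the trainable (LayerNorm) parameters.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

Because only the per-layer gain and bias vectors are trained, the tunable-parameter count is tiny relative to the full model, which is what makes the comparison against BitFit (which also tunes bias terms elsewhere) meaningful.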
Paper Type: short
Research Area: Machine Learning for NLP