Fine-Tuning Pretrained Models with NVIB for Improved Generalisation

ACL ARR 2025 February Submission 3325 Authors

15 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Fine-tuned pretrained attention-based models often struggle to generalise, leading to poor performance under out-of-domain transfer, distribution shift, and few-shot learning. This limitation appears across modalities such as speech, text, graphs, and vision. Nonparametric Variational Information Bottleneck (NVIB), on which our method builds, is an attention-based information-theoretic regulariser for pretrained models that has been shown to improve generalisation. However, prior work has applied NVIB only to the text modality and without fine-tuning. We investigate whether NVIB's ability to remove information from pretrained embeddings helps the model avoid spurious correlations with noisy and superficial features during fine-tuning. We are the first to integrate NVIB regularisation during fine-tuning, across multiple diverse models and modalities. This required architectural modifications that enhance adaptability and stability during fine-tuning and simplify evaluation. We find improved out-of-distribution generalisation in speech quality assessment and language identification, text with induced attention sparsity, graph-based link prediction, and image tasks including few-shot classification and privacy classification.
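To make the high-level idea of regularised fine-tuning concrete, the sketch below shows how an information-bottleneck-style penalty might be added to a standard fine-tuning loss. This is a minimal, hypothetical illustration only: the wrapper class, the Gaussian-KL surrogate, and names such as `BottleneckedEncoder` and `beta` are assumptions for exposition and are not the paper's NVIB formulation, which regularises attention nonparametrically.

```python
# Hypothetical sketch: fine-tuning with an information-bottleneck-style penalty.
# The Gaussian latent and KL term below are illustrative assumptions, not the
# paper's NVIB regulariser.
import torch
import torch.nn as nn

class BottleneckedEncoder(nn.Module):
    """Wraps a pretrained encoder and maps its embeddings to a latent
    distribution whose KL to a standard-normal prior acts as the regulariser."""
    def __init__(self, pretrained_encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.encoder = pretrained_encoder
        self.to_mu = nn.Linear(hidden_dim, hidden_dim)
        self.to_logvar = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x):
        h = self.encoder(x)                                   # (batch, seq, hidden)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterise
        # KL(q(z|x) || N(0, I)) summed over dimensions, averaged over tokens.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(-1).mean()
        return z, kl

def fine_tune_step(model, head, batch, labels, optimiser, beta=1e-3):
    """One fine-tuning step: task loss plus a weighted bottleneck penalty."""
    z, kl = model(batch)
    logits = head(z.mean(dim=1))          # simple mean-pooled classification head
    task_loss = nn.functional.cross_entropy(logits, labels)
    loss = task_loss + beta * kl          # beta trades task fit against compression
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```

In this kind of setup, the penalty weight (here `beta`) controls how aggressively information is removed from the pretrained embeddings during fine-tuning, which is the trade-off the abstract describes.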
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: Nonparametric Variational Information Bottleneck, Regularisation, Fine-tuning, Transformers, Generalisation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Theory
Languages Studied: English, Chinese, Czech, Dutch, Estonian, French, German, Italian, Polish, Romanian, Slovenian, Spanish, Arabic, Catalan, Georgian, Greek, Indonesian, Japanese, Kyrgyz, Latvian, Maltese, Persian, Portuguese, Russian, Swedish, Tamil, Turkish, Welsh
Submission Number: 3325