Keywords: local learning, test-time fine-tuning, transductive learning, data selection, retrieval, active learning, transductive active learning, language modeling, uncertainty quantification
TL;DR: We develop SIFT, an effective data selection method for fine-tuning LLMs. We show that test-time fine-tuning with SIFT can significantly and robustly improve language modeling ability.
Abstract: Recent efforts in fine-tuning language models often rely on automatic data selection, commonly using Nearest Neighbor retrieval from large datasets. However, we theoretically show that this approach tends to select redundant data, which limits its effectiveness or even hurts performance. To address this, we introduce SIFT, a data selection algorithm designed to reduce uncertainty about the model's response to the prompt, unifying ideas from retrieval and active learning. SIFT accounts for redundant information and optimizes the overall information gain of the selected examples. Our evaluations, focusing on prompt-specific fine-tuning at test time, show that SIFT consistently outperforms Nearest Neighbor retrieval in language modeling on the Pile dataset, with minimal computational overhead. Whereas Nearest Neighbor retrieval typically fails in the presence of information duplication, SIFT is entirely robust to such cases.
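The core idea in the abstract, selecting examples for their overall information gain rather than raw similarity to the prompt, can be sketched with a greedy kernel-based rule: at each step, pick the example that most reduces the posterior variance of a regularized kernel model at the prompt embedding. This is an illustrative sketch under assumed cosine-similarity embeddings and a hypothetical regularizer `lam`, not the authors' implementation:

```python
import numpy as np

def sift_select(prompt_emb, data_embs, k, lam=0.1):
    """Greedy uncertainty-reduction sketch: repeatedly add the example
    that most shrinks the posterior variance at the prompt under a
    linear-kernel model with regularization `lam` (hypothetical value).
    A duplicate of an already-selected example shrinks variance little,
    so redundant data is naturally down-weighted."""
    def normalize(X):
        return X / np.linalg.norm(X, axis=-1, keepdims=True)

    x = normalize(prompt_emb[None, :])[0]   # prompt embedding
    X = normalize(data_embs)                # candidate embeddings
    selected = []
    for _ in range(k):
        best_i, best_var = None, np.inf
        for i in range(len(X)):
            if i in selected:
                continue
            S = X[selected + [i]]
            # regularized kernel matrix of the tentative selection
            K = S @ S.T + lam * np.eye(len(S))
            kx = S @ x
            # posterior variance of the prediction at the prompt
            var = 1.0 - kx @ np.linalg.solve(K, kx)
            if var < best_var:
                best_i, best_var = i, var
        selected.append(best_i)
    return selected
```

On a toy set containing an exact duplicate, plain Nearest Neighbor retrieval picks the duplicate pair, while the greedy rule above skips the duplicate in favor of a slightly less similar but informative example:

```python
prompt = np.array([1.0, 0.0])
data = np.array([[0.92, 0.4],    # closest to prompt
                 [0.92, 0.4],    # exact duplicate
                 [0.9, -0.44]])  # less similar, but non-redundant
sift_select(prompt, data, 2)     # selects indices 0 and 2, not the duplicate
```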
Submission Number: 44