Keywords: Large Language Model, fine-tuning, data efficiency, task difficulty, annotation cost reduction
TL;DR: We explore and propose metrics to efficiently estimate the fine-tuning data size required to achieve a desired performance level.
Abstract: While large language models (LLMs) demonstrate reasonable zero-shot capability across many downstream tasks, fine-tuning is a common practice to improve their performance. However, a task's \textit{data efficiency} --- i.e., the number of fine-tuning examples needed to achieve a desired level of performance --- is often unknown, resulting in costly cycles of incremental annotation and retraining. Indeed, we demonstrate across a curated set of 30 specialized tasks that performant LLMs may struggle zero-shot but can attain stronger performance after fine-tuning. This motivates the need for methods to predict a task's data efficiency \textit{without} requiring incremental annotation. After introducing a concrete metric that quantifies a task's data efficiency, we propose using the \textit{gradient cosine similarity of low-confidence examples} as a way to predict data efficiency based on a small number of labeled samples. We validate our approach on the collected set of tasks with varying data efficiencies, attaining 8.6% error in overall data efficiency prediction and eliminating hundreds of unnecessary annotations. Our experimental results and implementation code are available in the supplementary material.
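To make the proposed signal concrete, below is a minimal sketch of computing gradient cosine similarity over low-confidence examples from a handful of labeled samples. The choice of model (`distilbert-base-uncased`), the confidence threshold, and restricting gradients to the classifier head are illustrative assumptions for brevity, not details specified in the abstract; the paper's actual implementation is in the supplementary material.

```python
# Hedged sketch: gradient cosine similarity of low-confidence examples as a
# rough data-efficiency signal. Model, threshold, and the restriction to
# classifier-head gradients are assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased"   # assumed stand-in model
CONFIDENCE_THRESHOLD = 0.6               # assumed cutoff for "low confidence"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def per_example_gradient(text: str, label: int) -> torch.Tensor:
    """Flattened gradient of the cross-entropy loss w.r.t. the classifier head."""
    model.zero_grad()
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    logits = model(**inputs).logits
    loss = F.cross_entropy(logits, torch.tensor([label]))
    loss.backward()
    grads = [p.grad.detach().flatten() for p in model.classifier.parameters()]
    return torch.cat(grads)

def confidence(text: str) -> float:
    """Maximum softmax probability as a simple confidence score."""
    with torch.no_grad():
        logits = model(**tokenizer(text, return_tensors="pt", truncation=True)).logits
    return F.softmax(logits, dim=-1).max().item()

def gradient_cosine_score(samples: list[tuple[str, int]]) -> float:
    """Average pairwise cosine similarity of gradients from low-confidence samples."""
    low_conf = [s for s in samples if confidence(s[0]) < CONFIDENCE_THRESHOLD]
    grads = [per_example_gradient(text, label) for text, label in low_conf]
    if len(grads) < 2:
        return float("nan")  # not enough low-confidence examples to compare
    sims = [F.cosine_similarity(grads[i], grads[j], dim=0).item()
            for i in range(len(grads)) for j in range(i + 1, len(grads))]
    return sum(sims) / len(sims)

if __name__ == "__main__":
    labeled_samples = [("great movie", 1), ("terrible plot", 0), ("boring film", 0)]
    print(f"gradient cosine score: {gradient_cosine_score(labeled_samples):.3f}")
```

Intuitively, higher agreement among the gradients of examples the model is unsure about suggests that a few annotations push the model in a consistent direction, i.e., a more data-efficient task; how this score maps onto the paper's data-efficiency metric is calibrated on the curated task set.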
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 18545