TL;DR: We disentangle data representations from selection algorithms in targeted instruction selection for LLM fine-tuning, study them across tasks, budgets, and models, and unify many selection algorithms as approximate distance minimizers.
Abstract: Instruction fine-tuning of large language models (LLMs) often involves selecting a subset of instruction training data from a large candidate pool, using a small query set from the target task. Despite growing interest, the literature on targeted instruction selection remains fragmented and opaque: methods vary widely in selection budgets, often omit zero-shot baselines, and frequently entangle the contributions of key components. As a result, practitioners lack actionable guidance on selecting instructions for their target tasks. In this work, we aim to bring clarity to this landscape by disentangling and systematically analyzing the two core ingredients: data representation and selection algorithms. Our framework enables controlled comparisons across models, tasks, and budgets. We find that only gradient-based data representations choose subsets whose similarity to the query consistently predicts performance across datasets, models, and candidate pools. While no single method dominates, gradient-based representations paired with greedy round-robin selection often perform best on average at low budgets, but these gains diminish at larger budgets. Finally, we unify several existing selection algorithms as forms of approximate distance minimization between the selected subset and the query set, and support this view with new generalization bounds. More broadly, our findings provide critical insights and a foundation for more principled data selection in LLM fine-tuning. The code is available at https://github.com/dcml-lab/targeted-instruction-selection.
Lay Summary: Large language models are often adapted to a specific task by training them on examples of instructions and answers. In targeted instruction selection, the goal is to use a small set of examples from the target task to choose additional training examples from a much larger pool. This matters because training on every available example can be costly and may even make the model worse on the task we care about. The key question is: which examples should we select to best improve performance on the target task?
Many methods have been proposed for targeted instruction selection, but the existing results are hard to compare. Studies often use different settings, leave out simple baselines, or combine several design choices at once, making it unclear which parts of a method actually matter.
This paper studies instruction selection in a more controlled way. We separate two main choices: how each training example is represented for comparison, and which selection algorithm is used to choose examples from the larger pool based on that representation. We compare these choices across different models, tasks, and amounts of selected data. We find that gradient-based representations are the only ones for which closeness to the target examples consistently lines up with better fine-tuning results. No single method is best in every setting, but at small data budgets, gradient-based representations combined with a balanced greedy selection strategy often perform best on average. As the amount of selected data grows, the differences between methods become smaller.
We also show that several existing selection algorithms can be understood as trying to make the selected training examples close to the target examples, and we provide theory supporting this view. Overall, our study clarifies which parts of instruction selection matter most and gives practitioners more practical guidance for choosing training data when fine-tuning language models.
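To make the "balanced greedy selection strategy" above concrete, here is a minimal sketch of a greedy round-robin selector. It is an illustration under stated assumptions, not the implementation from the linked repository: the function name `round_robin_select`, the use of cosine similarity, and the toy 2-D vectors are all hypothetical, and the representations (gradient-based or otherwise) are assumed to be precomputed, nonzero vectors.

```python
import numpy as np


def round_robin_select(cand, query, budget):
    """Greedy round-robin selection (illustrative sketch, not the paper's code).

    cand:   (num_candidates, dim) array of candidate representations
    query:  (num_queries, dim) array of target-task query representations
    budget: number of candidates to select
    Returns a list of selected candidate indices in selection order.
    """
    # Assumption: cosine similarity is the closeness measure; rows are nonzero.
    cand_n = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query, axis=1, keepdims=True)
    sim = query_n @ cand_n.T              # (num_queries, num_candidates)

    # For each query, rank candidates from most to least similar.
    order = np.argsort(-sim, axis=1)
    ptr = [0] * len(query)                # next rank to inspect, per query

    budget = min(budget, len(cand))       # cannot select more than the pool
    selected, seen = [], set()
    while len(selected) < budget:
        # Each query takes one turn per round, which keeps the subset balanced.
        for q in range(len(query)):
            if len(selected) >= budget:
                break
            # Skip candidates already taken by another query.
            while int(order[q, ptr[q]]) in seen:
                ptr[q] += 1
            idx = int(order[q, ptr[q]])
            selected.append(idx)
            seen.add(idx)
            ptr[q] += 1
    return selected


# Toy data: two candidates near each of two queries.
toy_cand = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_query = np.array([[1.0, 0.0], [0.0, 1.0]])
picked = round_robin_select(toy_cand, toy_query, budget=3)
```

Because every query picks once per round, no single query can dominate the selected subset, which is one plausible reading of "balanced". Swapping the similarity measure or the representation changes only how `sim` is computed, which mirrors the separation between representation and selection algorithm described above.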
Link To Code: https://github.com/dcml-lab/targeted-instruction-selection
Primary Area: Deep Learning->Large Language Models
Keywords: targeted instruction selection, subset selection, instruction fine-tuning, post-training
Originally Submitted PDF: pdf
Submission Number: 8507