Keywords: reasoning, diverse data generation, model merging
TL;DR: We propose merging base and instruction-tuned language models to preserve diverse generation while maintaining instruction following, and show that the merged model outperforms a non-merged model at creating reasoning traces for fine-tuning distilled models.
Abstract: Pre-training a language model equips it with a broad understanding of the world, while fine-tuning refines it into a helpful assistant. However, fine-tuning does not exclusively enhance task-specific behaviors; it also suppresses some of the beneficial variability from pre-training. This reduction in diversity is partly due to the optimization process, which theoretically decreases model entropy in exchange for task performance. To counteract this, we introduce hindsight merging, a technique that combines a fine-tuned model with a previous training checkpoint using linear interpolation to restore entropy and improve performance. Hindsight-merged models retain strong instruction-following capabilities and alignment while displaying the increased diversity present in the base model. Additionally, this results in improved inference scaling, achieving a consistent 20-50% increase in pass@10 relative to the instruction-tuned model across a coding benchmark and a series of models. Our findings suggest that hindsight merging is an effective strategy for generating diverse outputs that follow instructions.
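The core operation the abstract describes, linearly interpolating a fine-tuned model with an earlier training checkpoint, can be sketched as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that): the function name, the parameter dictionaries, and the interpolation weight `alpha` are all illustrative. The same elementwise formula applies whether the values are Python floats, NumPy arrays, or PyTorch tensors from a `state_dict`.

```python
def hindsight_merge(checkpoint_state, finetuned_state, alpha=0.5):
    """Elementwise linear interpolation of two parameter dicts.

    alpha=1.0 returns the fine-tuned weights unchanged; alpha=0.0
    returns the earlier checkpoint. Intermediate values trade
    instruction-following ability against restored diversity.
    (Illustrative sketch; keys are assumed to match across both dicts.)
    """
    return {
        name: alpha * w_ft + (1.0 - alpha) * checkpoint_state[name]
        for name, w_ft in finetuned_state.items()
    }


# Toy usage with scalar "parameters" to show the arithmetic:
ckpt = {"layer.weight": 0.0, "layer.bias": 2.0}
ft = {"layer.weight": 1.0, "layer.bias": 4.0}
merged = hindsight_merge(ckpt, ft, alpha=0.25)
```

With real models one would apply this over `model.state_dict()` tensors and load the result back with `load_state_dict`; the choice of `alpha` controls how much of the base model's entropy is restored.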
Latex Source Code: zip
Code Link: https://github.com/benediktstroebl/hindsight-merging
Signed PMLR Licence Agreement: pdf
Readers: auai.org/UAI/2025/Conference, auai.org/UAI/2025/Conference/Area_Chairs, auai.org/UAI/2025/Conference/Reviewers, auai.org/UAI/2025/Conference/Submission758/Authors, auai.org/UAI/2025/Conference/Submission758/Reproducibility_Reviewers
Submission Number: 758