Assessing Diversity Collapse in Reasoning

Published: 08 Mar 2025, Last Modified: 08 Mar 2025 · SSI-FM Poster · CC BY 4.0
Keywords: LLM, reasoning, supervised finetuning, reinforcement learning, decoding strategy
Abstract: We identify a striking phenomenon in large language models finetuned on reasoning tasks: as Pass@1 improves during supervised finetuning, Pass@k rapidly deteriorates and fails to recover with reinforcement learning or self-improvement. We formalize the relationship between expected Pass@k and Pass@1 over the test distribution and attribute the early drop in Pass@k to diversity collapse, where finetuning causes the probability mass to converge toward a single reasoning path and final answer for each test question. We prove theoretically that the standard finetuning pipeline of SFT followed by RL leads to diversity collapse in reasoning models. We then estimate the optimal Pass@k performance achievable by an oracle with access to the model's distribution over final answers marginalized over all rollouts, and reveal a significant gap relative to current token-level diverse decoding methods such as temperature scaling, top-k, nucleus, and min-p sampling. These results highlight the need for better decoding strategies for generating reasoning steps during self-improvement and inference. Finally, we propose a promising solution based on model weight interpolation.
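A minimal sketch of the quantities the abstract refers to. The `pass_at_k` function is the standard unbiased Pass@k estimator from n sampled rollouts; `oracle_pass_at_k` is an illustrative assumption of the oracle described above (submit the k most probable distinct final answers from the model's marginal answer distribution), not necessarily the paper's exact procedure, and the toy distributions are hypothetical.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate given n sampled rollouts, c of which are correct:
    1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def oracle_pass_at_k(answer_probs: dict[str, float], correct: str, k: int) -> float:
    """Best-case Pass@k for an oracle that sees the model's marginal distribution
    over final answers: submit the k most probable distinct answers and succeed
    iff the correct answer is among them (assumed reading of the oracle)."""
    top_k = sorted(answer_probs, key=answer_probs.get, reverse=True)[:k]
    return float(correct in top_k)

# Sampling-based estimate: 3 correct out of 16 rollouts on a question.
print(pass_at_k(16, 3, 1))  # ~0.19
print(pass_at_k(16, 3, 8))  # 0.90

# Diversity collapse in miniature (correct answer is "42"):
# before finetuning, mass is spread over several answers; after, it collapses
# onto one, so extra samples can no longer recover questions the model gets wrong.
pre_sft  = {"41": 0.35, "42": 0.30, "43": 0.20, "40": 0.15}
post_sft = {"41": 0.96, "43": 0.03, "42": 0.01}
print(oracle_pass_at_k(pre_sft,  "42", k=2))  # 1.0
print(oracle_pass_at_k(post_sft, "42", k=2))  # 0.0
```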
Submission Number: 78