Keywords: Language modeling, Sampling
TL;DR: We introduce Resample-Previous-Tokens (RPT), a sampling method that lets models revisit and replace previously generated tokens, yielding ~10% relative improvements in reasoning and coding after a short fine-tuning stage.
Abstract: Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. Fine-tuning a pretrained 8B-parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
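To make the decoding loop concrete, below is a minimal sketch of RPT-style sampling. It is an illustration under stated assumptions, not the paper's exact fine-tuned procedure: the `model(ids)` call (returning next-token logits of shape `[1, len, vocab]`), the `window` size, and the resampling rule (redrawing each window token from the model's left-context distribution) are all hypothetical choices made for the sake of the sketch.

```python
import torch

def rpt_sample(model, prompt_ids, max_new_tokens=128, window=8, temperature=1.0):
    """Sketch of Resample-Previous-Tokens (RPT) decoding.

    After each newly sampled token, revisit the last `window` generated
    tokens and potentially replace each one by redrawing it from the
    model's predictive distribution given its (possibly updated) prefix.
    Assumes a hypothetical `model(ids)` returning logits [1, len, vocab].
    """
    ids = prompt_ids.clone()          # shape [1, prompt_len]
    prompt_len = ids.shape[1]

    for _ in range(max_new_tokens):
        # Standard autoregressive step: sample the next token.
        logits = model(ids)[:, -1, :] / temperature
        next_id = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        ids = torch.cat([ids, next_id], dim=1)

        # Revisit a window of previously generated tokens (never the prompt).
        # Positions are reconsidered left to right, so later positions
        # condition on any replacements already made earlier in the window.
        start = max(prompt_len, ids.shape[1] - window)
        for pos in range(start, ids.shape[1]):
            logits = model(ids[:, :pos])[:, -1, :] / temperature
            probs = torch.softmax(logits, dim=-1)
            ids[0, pos] = torch.multinomial(probs, 1).item()

    return ids
```

In this sketch each revisited token is simply resampled in place; the fine-tuning described in the abstract suggests the actual model is trained to make such replacements useful, which this base-model illustration does not capture.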
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 13366