Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning

Justin Lovelace; Christian K Belardi; Sofian Zalouk; Adhitya Polavaram; Srivatsa R Kundurthy; Kilian Q Weinberger

Published: 08 Jul 2025, Last Modified: 26 Aug 2025, Venue: COLM 2025, License: CC BY 4.0

Keywords: diffusion, latent diffusion, language generation

TL;DR: We introduce a unified architecture that pauses autoregressive text generation for latent diffusion planning, enabling higher quality and more controllable text generation with improved language understanding.

Abstract: The Stop-Think-AutoRegress Language Diffusion Model (STAR-LDM) integrates latent diffusion planning with autoregressive generation. Unlike conventional autoregressive language models, which are limited to token-by-token decisions, STAR-LDM incorporates a "thinking" phase that pauses generation to refine a semantic plan through diffusion before continuing. This enables global planning in continuous space prior to committing to discrete tokens. Evaluations show that STAR-LDM significantly outperforms similar-sized models on language understanding benchmarks and achieves win rates above 70% in LLM-as-judge comparisons for narrative coherence and commonsense reasoning. The architecture also allows straightforward control through lightweight classifiers, enabling fine-grained steering of attributes without model retraining while maintaining better fluency-control trade-offs than specialized approaches.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html

Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html

Submission Number: 1720
