Track: long paper (up to 8 pages)
Keywords: Diffusion Language Models, discrete diffusion
TL;DR: Eso-LMs are a new hybrid language model family that bridges autoregressive and masked diffusion modeling, unlocks full KV caching for fast inference, and achieves a new state of the art on the generation speed–quality Pareto frontier.
Abstract: Diffusion-based language models offer a compelling alternative to autoregressive
(AR) models by enabling parallel and controllable generation. Within this family,
Masked Diffusion Models (MDMs) currently perform best but still underperform
AR models in perplexity and lack key inference-time efficiency features, most
notably KV caching. We introduce Esoteric Language Models (Eso-LMs), a new
family of models that fuses AR and MDM paradigms, smoothly interpolating
between their perplexities while overcoming their respective limitations. Unlike
prior work, which uses transformers with bidirectional attention as MDM denoisers,
we exploit the connection between MDMs and Any-Order autoregressive models
and adopt causal attention. This design lets us compute the exact likelihood of
MDMs for the first time and, crucially, enables us to introduce exact KV caching
for MDMs while preserving parallel generation over the full sequence length
for the first time, significantly improving inference efficiency. Combined with an
optimized sampling schedule, Eso-LMs achieve a new state of the art on the
speed–quality Pareto frontier for unconditional generation. On longer contexts, they
yield 14–65× faster inference than standard MDMs and 3–4× faster inference
than prior semi-autoregressive approaches. We provide code, model checkpoints, and a talk recording on the project page: https://s-sahoo.com/Eso-LMs.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 58