Keywords: masked diffusion models, autoregressive models, expressiveness
Abstract: This paper formally studies generation processes, including autoregressive next-token prediction and masked diffusion, at a level of abstraction beyond architectural specifics. At this level, we take first steps towards quantifying the benefits of "natural" generation processes, intuitively those that align with underlying physical processes, through measurable criteria such as computational hardness and learnability. In particular, we demonstrate that allowing generation to proceed beyond autoregression and current masked diffusion, with the capability to rewrite and edit, brings significant theoretical and empirical advantages, with important implications for frontier LLMs that aspire to tackle increasingly hard problems and to work universally across domains beyond natural language, such as coding and science.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 14419