Keywords: Diffusion Language Models, Reinforcement Learning
Abstract: We present ${\rm T}^\star$, a simple TraceRL-based training curriculum for progressively scaling the block size of masked diffusion language models (MDMs). Starting from an AR-initialized small-block MDM, ${\rm T}^\star$ transitions smoothly to larger blocks, enabling higher-parallelism decoding with minimal performance degradation on math reasoning benchmarks. Further analysis suggests that ${\rm T}^\star$ can converge to an alternative decoding schedule $\hat{\rm S}$ with comparable performance.
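The abstract describes a curriculum that grows the decoding block size over training. A minimal sketch of such a schedule is below; the function name, the linear interpolation, the power-of-two snapping, and the start/end block sizes are all illustrative assumptions, not details from the paper.

```python
# Hypothetical block-size curriculum for a masked diffusion LM.
# The schedule shape (linear ramp snapped to powers of two) is an
# assumption for illustration only.

def block_size_schedule(step: int, total_steps: int,
                        start_block: int = 4, end_block: int = 32) -> int:
    """Scale the block size from start_block to end_block over training.

    Linearly interpolates on the training fraction, then snaps down to
    the nearest power of two so block boundaries stay aligned.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    raw = start_block + frac * (end_block - start_block)
    # Snap to the largest power of two not exceeding `raw`.
    size = 1
    while size * 2 <= raw:
        size *= 2
    return max(size, start_block)
```

For example, with the defaults above, the schedule starts at block size 4 and reaches 32 by the end of training, stepping through intermediate powers of two.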
Paper Type: Short
Research Area: Resources and Evaluation
Research Area Keywords: Diffusion Language Model, Reinforcement Learning, Denoising Schedule
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 9169