Keywords: diffusion language model, masked diffusion language model, prompt engineering, prompt infilling
Abstract: Masked diffusion language models possess infilling capabilities, yet current supervised fine-tuning (SFT) practice uses response-only masking, which prevents models from learning to infill prompt tokens. We first show that publicly available masked diffusion language models exhibit limited prompt infilling capability. This limitation stems from a training-inference gap: models never encounter prompt masking during the SFT stage. To address this, we introduce a two-stage SFT framework: (1) full-sequence masking to enable prompt infilling, followed by (2) response-only masking to preserve downstream task accuracy. This simple approach enables models to infill optimal prompts from few-shot examples at inference time. Evaluating on math, multi-hop reasoning, and LLM-as-a-Judge tasks, we show that our training framework unlocks prompt infilling capabilities. Our results suggest that the training procedure is the primary bottleneck for unlocking prompt infilling in masked diffusion language models.
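The distinction between the two masking regimes in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name `sample_mask` and the per-sequence uniform masking ratio are assumptions based on standard masked diffusion training.

```python
import random

def sample_mask(prompt_len, total_len, stage, mask_ratio=None):
    """Sample a token mask for one (prompt, response) sequence.

    stage 1: full-sequence masking -- any position may be masked,
             so the model also learns to infill prompt tokens.
    stage 2: response-only masking -- only positions >= prompt_len
             are eligible, matching standard SFT practice.
    """
    if mask_ratio is None:
        # masked diffusion training typically samples a masking
        # level per sequence (assumed uniform here)
        mask_ratio = random.random()
    if stage == 1:
        eligible = list(range(total_len))
    else:
        eligible = list(range(prompt_len, total_len))
    k = max(1, round(mask_ratio * len(eligible)))
    masked = set(random.sample(eligible, k))
    return [i in masked for i in range(total_len)]
```

Under stage 2, prompt positions are never masked, which is exactly the training-inference gap the paper identifies: at inference, prompt infilling requires the model to denoise masked prompt tokens it never saw masked during SFT.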
Paper Type: Long
Research Area: Language Models
Research Area Keywords: prompting, fine-tuning
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 8412