Improving the Efficacy of Test-Time Steering in Masked Diffusion Models with Parallel Tempering

Published: 28 May 2026, Last Modified: 28 May 2026, Venue: GenBio 2026 Spotlight, License: CC BY 4.0
Keywords: Masked Diffusion Models, Parallel Tempering, Test-Time Steering, Generative AI for Biology
Abstract: Masked Diffusion Models (MDMs) provide expressive generative priors for discrete biological sequences such as proteins and DNA. However, many downstream tasks require steering these models at inference time to optimize arbitrary external reward functions. Existing test-time steering methods face a fundamental exploration-exploitation trade-off in multimodal reward landscapes: they either collapse into suboptimal or unrealistic modes, or they require massive sampling budgets to find rare, high-reward states. We address this tension with Parallel Tempering for MDMs (PT-MDM). To adapt parallel tempering to MDM generation, we tightly couple the reward temperature to the model's sequence remasking fraction. Hot replicas apply aggressive remasking for global exploration, cold replicas use conservative remasking for targeted refinement, and periodic replica exchange propagates rare discoveries across replica chains. This framework enables both global exploration and local exploitation without compromising sequence plausibility. Experiments on inverse protein folding and regulatory DNA design show that PT-MDM consistently outperforms test-time baselines, approaches the reward performance of fine-tuned models on key metrics, and preserves sample fidelity without any training.
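The coupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the temperature ladder, the remasking schedule, or the acceptance rules, so everything below is an assumption. In particular, a uniform resampler stands in for the MDM denoiser, `toy_reward` is a hypothetical reward, the ladder values in `temps` and `fracs` are arbitrary, and the swap test is the standard parallel tempering Metropolis criterion.

```python
import math
import random

ALPHABET = "ACGT"
TARGET = "ACGTACGTACGT"  # hypothetical optimum, used only by the toy reward


def toy_reward(seq):
    # Hypothetical reward: number of positions that match TARGET.
    return sum(a == b for a, b in zip(seq, TARGET))


def remask_and_resample(seq, frac, rng):
    # Stand-in for "remask, then denoise": a real MDM would fill masked
    # positions from its learned conditional; here we resample uniformly.
    n_mask = max(1, round(frac * len(seq)))
    new = list(seq)
    for pos in rng.sample(range(len(seq)), n_mask):
        new[pos] = rng.choice(ALPHABET)
    return "".join(new)


def pt_steer(reward, length=12, temps=(0.2, 0.5, 1.0, 2.0),
             fracs=(0.1, 0.25, 0.5, 0.9), steps=200, swap_every=5, seed=0):
    # temps[k] and fracs[k] rise together: colder replicas remask less.
    rng = random.Random(seed)
    replicas = ["".join(rng.choice(ALPHABET) for _ in range(length))
                for _ in temps]
    best = max(replicas, key=reward)
    for step in range(1, steps + 1):
        # Local move per replica: remask, resample, Metropolis accept.
        for k, (temp, frac) in enumerate(zip(temps, fracs)):
            cand = remask_and_resample(replicas[k], frac, rng)
            delta = reward(cand) - reward(replicas[k])
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                replicas[k] = cand
            if reward(replicas[k]) > reward(best):
                best = replicas[k]
        # Periodic exchange between neighbouring temperatures.
        if step % swap_every == 0:
            for k in range(len(temps) - 1):
                gap = reward(replicas[k + 1]) - reward(replicas[k])
                log_acc = (1 / temps[k] - 1 / temps[k + 1]) * gap
                if log_acc >= 0 or rng.random() < math.exp(log_acc):
                    replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
    return best, replicas


best, replicas = pt_steer(toy_reward)
print(best, toy_reward(best))
```

In this sketch a high-reward sequence found by a hot replica always swaps into the colder neighbour, because the log acceptance is non-negative whenever the hotter replica holds the better sequence. That is the propagation effect the abstract attributes to replica exchange. With a real MDM prior, the local acceptance step would also need to account for the model's proposal probabilities, which this uniform stand-in ignores.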
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 96