Full-length mRNA Design with Reward-Guided Masked Diffusion Model Fine-Tuning

Sawan Patel; Sophia Tang; Pranam Chatterjee; Sherwood Yao

Published: 02 Mar 2026, Last Modified: 30 Mar 2026. Venue: ReALM-GEN 2026 - ICLR 2026 Workshop. License: CC BY 4.0

Keywords: mrna, sequence, generation, reward, design, rl, diffusion

TL;DR: We show that conditional, multi-property design of full-length mRNA sequences using reward-guided discrete diffusion trajectories yields wholesale improvements over existing inference-time methods.

Abstract: High-fitness mRNA design requires the simultaneous optimization of multiple application-critical properties, but the design principles governing this high-dimensional landscape remain poorly understood. This motivates the need for methods that provide systematic, controllable generation of mRNA sequences. To address this, we introduce T3PO-mRNA, a framework for computing reward-guided discrete diffusion trajectories that iteratively construct increasingly accurate approximations of the Pareto frontier. Our approach leverages tree search to identify high-reward sequence trajectories and uses these trajectories to fine-tune diffusion models on progressively stronger sequence buffers. We demonstrate that T3PO-mRNA effectively designs therapeutic mRNAs with optimized half-life and translation efficacy, enabling both improved multi-objective performance and efficient inference-time sampling over prior inference-time guidance methods.
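
Illustrative sketch (not the authors' implementation): the abstract refers to approximating a Pareto frontier over two objectives and to maintaining progressively stronger sequence buffers. The minimal Python example below shows one common way such a buffer can be kept non-dominated. It assumes each candidate is summarized by a pair of scores (half-life, translation efficiency), both to be maximized. The names `dominates`, `pareto_front`, and `update_buffer` and the toy score values are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Assumed representation: (half_life_score, translation_score), both maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[Score]:
    """Return the scores that no other score in the list dominates (input order kept)."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


def update_buffer(buffer: List[Score], candidates: List[Score]) -> List[Score]:
    """Merge new candidates into the buffer, drop duplicates, keep only non-dominated entries."""
    merged = list(dict.fromkeys(buffer + candidates))  # order-preserving de-duplication
    return pareto_front(merged)


# Toy usage with made-up scores.
buffer = [(1.0, 3.0), (3.0, 1.0)]
buffer = update_buffer(buffer, [(2.0, 2.0), (4.0, 1.0)])
print(buffer)
```

In this toy run the candidate (4.0, 1.0) displaces (3.0, 1.0), so the buffer only ever moves toward the frontier. How T3PO-mRNA actually scores sequences, searches trajectories, and selects its fine-tuning buffer is not specified in the abstract.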

Submission Number: 16