Keywords: mrna, sequence, diffusion, rl, guidance, reward, optimization
TL;DR: We show that conditional, multi-property design of full-length mRNA sequences using reward-guided discrete diffusion trajectories yields wholesale improvements over existing inference-time methods.
Abstract: High-fitness mRNA design requires the simultaneous optimization of multiple
application-critical properties, but the design principles governing this high-
dimensional landscape remain poorly understood. This motivates the need for
methods that provide systematic, controllable generation of mRNA sequences. To
address this, we introduce T3PO-mRNA, a framework for computing reward-
guided discrete diffusion trajectories that iteratively construct increasingly accurate
approximations of the Pareto frontier. Our approach leverages tree search to
identify high-reward sequence trajectories and uses these trajectories to fine-tune
diffusion models on progressively stronger sequence buffers. We demonstrate that
T3PO-mRNA effectively designs therapeutic mRNAs with optimized half-life
and translation efficacy, enabling both improved multi-objective performance and
efficient inference-time sampling over prior inference-time guidance methods
Submission Number: 24
Loading