Keywords: RNA design, Diffusion models, Generative models
Abstract: RNA molecules underlie regulation, catalysis, and therapeutics in biological systems, yet de novo RNA design remains difficult because of the tight and highly non-linear sequence–structure coupling.
RNA sequence–structure co-design requires generating nucleotide sequences and 3D conformations jointly, which is challenging due to RNA’s conformational flexibility, non-canonical base pairing, and the scarcity of 3D structural data.
We introduce a joint generative framework that embeds RoseTTAFold2NA as the denoiser in a dual diffusion model, injecting rich cross-molecular priors while enabling sample-efficient learning from limited RNA data. Our method couples a discrete diffusion process for sequences with an $SE(3)$-equivariant diffusion for rigid-frame translations and rotations over all-atom coordinates. The architecture supports flexible conditioning and is further enhanced at inference by lightweight reinforcement learning (RL) techniques that optimize task-aligned rewards.
Across de novo RNA design, complex design, and protein-conditioned design tasks, our approach yields high self-consistency and confidence scores, improving over recent diffusion/flow baselines trained from scratch. These results demonstrate that leveraging pre-trained structural priors within a joint diffusion framework is a powerful paradigm for RNA design under data scarcity, enabling high-fidelity generation of standalone RNAs and functional RNA–protein interfaces.
Primary Area: generative models
Submission Number: 21266