To Optimize, Not to Invent: RNAGenScape for mRNA Sequence Generation and Optimization Without de novo Design

Published: 11 Jun 2025, Last Modified: 18 Jul 2025
Venue: GenBio 2025 Spotlight
License: CC BY 4.0
Keywords: biological sequence optimization, mRNA design, manifold learning, Langevin dynamics
TL;DR: We propose RNAGenScape to refine real mRNA sequences using property-guided Langevin dynamics in latent space, constrained by a denoising manifold projector for biological plausibility, outperforming generative and optimization baselines.
Abstract: Designing mRNA sequences with optimized biological properties remains a fundamental challenge in synthetic biology and therapeutic development. Deep generative models have enabled data-driven sequence generation, but most are built for de novo generation, that is, generating sequences entirely from scratch. This makes it difficult for them to refine existing sequences, interpolate between sequences, or produce interpretable optimization steps. In this work, we introduce RNAGenScape, a framework for mRNA design that combines Langevin dynamics with a learned manifold projector. Operating entirely in the latent space of a pretrained encoder, RNAGenScape updates latent codes using property-guided gradients and then projects each noisy step back onto the learned manifold to ensure biological plausibility. This approach enables property-guided optimization, smooth interpolation between arbitrary sequences, and tracking of interpretable latent trajectories, all without requiring explicit density estimation or score learning. We demonstrate on real zebrafish mRNA datasets that RNAGenScape can continuously steer sequences toward target properties while remaining close to natural sequences, and can generate biologically plausible intermediate variants along each trajectory. Our results establish a scalable and generalizable paradigm for controllable mRNA design and latent-space exploration in biological sequence modeling.
Submission Number: 115
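
The update loop described in the abstract (a property-guided gradient step with Langevin noise, followed by projection back onto a manifold) can be illustrated with a toy sketch. This is not the authors' implementation: the linear `property_score`, the unit-sphere `project_to_manifold`, and all hyperparameters are hypothetical stand-ins for the learned property predictor, the learned denoising manifold projector, and the pretrained encoder's latent space.

```python
import numpy as np

def property_score(z, w):
    """Hypothetical property predictor: a linear score w . z (assumption)."""
    return float(w @ z)

def property_grad(z, w):
    """Gradient of the linear score with respect to the latent code z."""
    return w

def project_to_manifold(z):
    """Toy stand-in for a learned manifold projector: renormalize z onto
    the unit sphere, which plays the role of the data manifold here."""
    return z / np.linalg.norm(z)

def guided_langevin(z0, w, n_steps=200, step_size=0.05, noise_scale=0.01, seed=0):
    """Run property-guided Langevin updates in latent space, projecting
    every noisy step back onto the toy manifold. Returns the trajectory
    as an array of shape (n_steps + 1, latent_dim)."""
    rng = np.random.default_rng(seed)
    z = project_to_manifold(np.asarray(z0, dtype=float))
    trajectory = [z.copy()]
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        # Gradient ascent on the property plus scaled Gaussian noise.
        z = z + step_size * property_grad(z, w) \
              + np.sqrt(2.0 * step_size) * noise_scale * noise
        # Projection keeps the latent code on the (toy) manifold.
        z = project_to_manifold(z)
        trajectory.append(z.copy())
    return np.stack(trajectory)

# Usage: start from an existing latent code and steer it toward a higher score.
w = np.zeros(8)
w[0] = 1.0
z0 = np.zeros(8)
z0[0], z0[1] = -0.5, 1.0
traj = guided_langevin(z0, w)
start_score = property_score(traj[0], w)
final_score = property_score(traj[-1], w)
```

Each row of `traj` is an intermediate latent code, so the trajectory itself gives the step-by-step path from the starting point to the optimized one. In a real setting each row would be decoded back to a sequence, which this sketch does not attempt.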