$\texttt{RNAGenScape}$: Property-Guided Optimization and Interpolation of mRNA Sequences with Manifold Langevin Dynamics
Keywords: biological sequence optimization
TL;DR: A dedicated and versatile mRNA sequence optimization and interpolation method.
Abstract: mRNA design and optimization are important in synthetic biology and therapeutic development, but remain understudied in machine learning. Systematic optimization of mRNAs is hindered by scarce and imbalanced data as well as by complex sequence-function relationships. We present $\texttt{RNAGenScape}$, a property-guided manifold Langevin dynamics framework that iteratively updates mRNA sequences within a learned latent manifold. $\texttt{RNAGenScape}$ combines an organized autoencoder, which structures the latent space by target properties for efficient and biologically plausible exploration, with a manifold projector that contracts each update step back onto the manifold. $\texttt{RNAGenScape}$ supports property-guided optimization and smooth interpolation between sequences, remains robust under scarce and undersampled data, and keeps intermediate products close to the viable mRNA manifold. Across three real mRNA datasets, $\texttt{RNAGenScape}$ improves the target properties with high success rates and efficiency, outperforming various generative and optimization methods developed for proteins or non-biological data. By providing continuous, data-aligned trajectories that reveal how edits influence function, $\texttt{RNAGenScape}$ establishes a scalable paradigm for controllable mRNA design and latent space exploration in mRNA sequence modeling.
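Illustrative sketch: the loop the abstract describes (a property-guided Langevin step in latent space, followed by a projection back onto the manifold) can be illustrated with a toy example. This is not the authors' implementation. The unit circle stands in for the learned latent manifold, normalization stands in for the learned manifold projector, and the first coordinate stands in for the predicted property. All function names and parameters (`project_to_manifold`, `property_gradient`, `manifold_langevin`, `step_size`, `temperature`) are hypothetical.

```python
import numpy as np

def project_to_manifold(z):
    # Toy projector (assumption): the "manifold" is the unit circle,
    # so projection is plain normalization.
    return z / np.linalg.norm(z)

def property_gradient(z):
    # Toy property (assumption): f(z) = z[0], so the gradient is constant.
    return np.array([1.0, 0.0])

def manifold_langevin(z0, n_steps=200, step_size=0.05, temperature=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    z = project_to_manifold(np.asarray(z0, dtype=float))
    trajectory = [z.copy()]
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        # Langevin update: gradient ascent on the property plus Gaussian noise.
        z = z + step_size * property_gradient(z) \
              + np.sqrt(2.0 * step_size * temperature) * noise
        # Contract the updated point back onto the manifold.
        z = project_to_manifold(z)
        trajectory.append(z.copy())
    return np.stack(trajectory)

traj = manifold_langevin([-0.6, 0.8])
print("start property:", traj[0, 0], "end property:", traj[-1, 0])
```

In this sketch every intermediate point stays on the toy manifold because the projection is applied after each step, which mirrors the abstract's claim that intermediate products remain close to the viable manifold. In the actual method, the projector and property predictor would be learned networks operating on autoencoder latents.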
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 930