LiRDSC: Ligand-Conditioned RNA Sequence Design via Diffusive Structural Conditioning

17 Sept 2025 (modified: 26 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: RNA design, RNA-Ligand, Structure
Abstract: Designing RNA sequences that bind specific small-molecule ligands is a central goal in molecular engineering. However, existing computational methods face two persistent challenges: extreme sensitivity to imperfections in the input tertiary-structure scaffolds, and a tendency toward mode collapse, where models generate generic, non-specific sequences rather than ligand-tailored designs. To address these challenges, we present two core contributions. First, we introduce RLData2400, a benchmark dataset that combines high-resolution experimental structures with diverse, high-confidence in silico models to support the development of ligand-conditioned RNA design models. Second, we propose LiRDSC (Ligand-conditioned RNA Design via Diffusive Structural Conditioning), a deep generative framework designed for ligand specificity. LiRDSC employs a Diffusive Structural Encoder (DSE), which learns robust representations by training on noise-perturbed structures, and a Ligand-Contextual FiLM Conditioner (LCFC), which conditions the model on the ligand's identity to prevent mode collapse. Trained on RLData2400, LiRDSC not only achieves high sequence recovery but also generates a diverse range of ligand-targeted RNA sequences. Its advantage is most pronounced on structurally augmented data, which supports the robustness imparted by the diffusion-based conditioning. Inverse folding experiments further confirm that the generated sequences accurately recapitulate their target tertiary structures. Finally, computational analysis predicts strong binding compatibility with the intended ligands, demonstrating LiRDSC's ability to produce RNA candidates that are both structurally viable and ligand-specific.
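The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian noise model, the linear maps, and the residual form `gamma = 1 + W e` are all assumptions made for illustration. The sketch shows (1) perturbing 3D coordinates with noise, as a structural encoder trained on noise-perturbed inputs might see them, and (2) FiLM-style conditioning, where a ligand embedding yields a per-channel scale and shift applied to per-residue features.

```python
import random


def perturb_coords(coords, sigma, rng):
    """Add isotropic Gaussian noise (std sigma) to each 3D coordinate.

    Hypothetical stand-in for the noise perturbation applied to input structures.
    """
    return [tuple(c + rng.gauss(0.0, sigma) for c in xyz) for xyz in coords]


def ligand_to_film_params(ligand_embedding, w_gamma, w_beta):
    """Map a ligand embedding to FiLM parameters with two linear maps.

    Assumption: gamma is parameterized as 1 + W_gamma e, so a zero
    embedding leaves the features unchanged.
    """
    gamma = [1.0 + sum(w * e for w, e in zip(row, ligand_embedding)) for row in w_gamma]
    beta = [sum(w * e for w, e in zip(row, ligand_embedding)) for row in w_beta]
    return gamma, beta


def film(features, gamma, beta):
    """Feature-wise linear modulation: out[i][k] = gamma[k] * features[i][k] + beta[k]."""
    return [[g * f + b for f, g, b in zip(row, gamma, beta)] for row in features]


if __name__ == "__main__":
    rng = random.Random(0)
    coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]       # toy backbone coordinates
    noisy = perturb_coords(coords, sigma=0.1, rng=rng)

    ligand = [1.0, 0.0]                               # toy ligand embedding
    w_gamma = [[0.5, 0.0], [0.0, 0.5]]                # hypothetical weights
    w_beta = [[-1.0, 0.0], [0.0, 1.0]]
    gamma, beta = ligand_to_film_params(ligand, w_gamma, w_beta)

    residue_features = [[1.0, 2.0], [3.0, 4.0]]       # one row per residue
    print(noisy)
    print(film(residue_features, gamma, beta))
```

Because the same scale and shift are applied to every residue, two different ligand embeddings produce different modulated features for an identical structure, which is the sense in which such conditioning can discourage ligand-agnostic outputs.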
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 8371