Keywords: Geometric algebra, Bayesian flow network, RNA inverse design
TL;DR: A more accurate model for RNA inverse design
Abstract: With the development of biotechnology, RNA therapies have shown great potential.
However, unlike in proteins, many more sequences can correspond to a single RNA three-dimensional structure. Most existing RNA design methods either consider only the secondary structure of RNA or can generate only a limited number of candidate sequences.
To address these limitations, we propose a geometric-algebra-enhanced $\textbf{B}$ayesian $\textbf{F}$low $\textbf{N}$etwork for the inverse design of $\textbf{R}$NA, called $\textbf{RBFN}$. RBFN uses a Bayesian Flow Network to model the distribution of nucleotide sequences in RNA, enabling the generation of more plausible RNA sequences. Because RNA conformations are more flexible, we also use geometric algebra to strengthen the modeling of RNA three-dimensional structure, which helps capture RNA structural properties.
In addition, to address the scarcity of RNA structure data and the fact that there are only four nucleotide types, we propose a new time-step distribution sampling scheme. Evaluation on the single-state fixed-backbone re-design benchmark and the multi-state fixed-backbone benchmark indicates that RBFN can outperform existing RNA design methods across a range of RNA design tasks, enabling effective RNA sequence design.
Primary Area: Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 10757