Abstract: Accurate protein representations that integrate sequence and three-dimensional (3D) structure are critical to many biological and biomedical tasks. Most existing models either ignore structure or combine it with sequence through a single, static fusion step. Here we present FusionProt, a unified model that learns representations via iterative, bidirectional fusion between a protein language model and a structure encoder. A single learnable token serves as a carrier, alternating between sequence attention and spatial message passing across layers. FusionProt is evaluated on Enzyme Commission (EC), Gene Ontology (GO), and mutation stability prediction tasks. It improves F\textsubscript{max} by a median of $+1.3$ points (up to $+2.0$) across EC and GO benchmarks, and boosts AUROC by $+3.6$ points over the strongest baseline on mutation stability. Inference cost remains practical, with only $\sim2\text{--}5\%$ runtime overhead.
Beyond state-of-the-art performance, representative biological case studies demonstrate FusionProt’s practical relevance and indicate that the model captures biologically meaningful features.
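For readers unfamiliar with the F\textsubscript{max} metric reported above, the following is a minimal sketch of how it is commonly computed for multi-label function prediction. It assumes the protein-centric definition widely used on EC and GO benchmarks (the maximum, over decision thresholds, of the harmonic mean of averaged precision and recall); the abstract does not state which variant FusionProt uses. The function name `f_max`, its arguments, and the threshold grid are illustrative, not taken from the paper.

```python
def f_max(scores, labels, thresholds=None):
    """Protein-centric F_max (assumed definition, not confirmed by the paper).

    scores[i][j]: predicted probability that protein i has term j.
    labels[i][j]: 1 if protein i is annotated with term j, else 0.
    """
    if thresholds is None:
        # Hypothetical grid: 0.01, 0.02, ..., 0.99
        thresholds = [t / 100 for t in range(1, 100)]

    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for s_row, y_row in zip(scores, labels):
            predicted = [s >= t for s in s_row]
            n_pred = sum(predicted)
            n_true = sum(y_row)
            n_hit = sum(1 for p, y in zip(predicted, y_row) if p and y)
            # Precision is averaged only over proteins with at least one prediction.
            if n_pred > 0:
                precisions.append(n_hit / n_pred)
            # Recall is averaged over proteins with at least one true annotation.
            if n_true > 0:
                recalls.append(n_hit / n_true)
        if not precisions or not recalls:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy usage: two proteins, two terms, predictions that separate cleanly.
toy_scores = [[0.9, 0.1], [0.2, 0.8]]
toy_labels = [[1, 0], [0, 1]]
toy_result = f_max(toy_scores, toy_labels)
```

Because the score is maximized over thresholds, it does not depend on choosing a single operating point, which is one reason it is a common summary metric when the number of labels per protein varies widely.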
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Wei_Liu3
Submission Number: 5752