Proteo-R1: Reasoning Foundation Models for De Novo Antibody Design

Published: 28 May 2026, Last Modified: 30 May 2026
Venue: GenBio 2026 Oral
License: CC BY 4.0
Keywords: Biology Foundation Models, Drug Discovery, LLM Reasoning
Abstract: Deep learning in de novo protein design has achieved atomic-level fidelity. However, existing models remain largely non-deliberative: they directly synthesize molecular geometries without explicitly reasoning about which residues or interactions are functionally essential. As a result, design decisions are entangled with continuous sampling dynamics, limiting interpretability, controllability, and systematic reuse of biochemical knowledge. We introduce Proteo-R1, a reasoning-guided protein design framework that explicitly decouples molecular understanding from geometric generation. Proteo-R1 adopts a dual-expert architecture, in which a multimodal large language model (LLM) serves as an understanding expert and analyzes protein sequences, structures, and textual context to identify key functional residues that govern binding and specificity. These residue-level decisions are then passed to a separate diffusion-based generation expert, which performs conditional co-design while respecting the fixed interaction anchors. This factorization mirrors how human experts approach molecular engineering: first, reasoning about critical interactions, then optimizing geometry subject to those constraints. By operationalizing reasoning as explicit residue-level commitments rather than latent textual guidance, Proteo-R1 achieves stable, interpretable, and modular integration of LLM reasoning with advanced geometric generative models. Code and demos are at https://smiles724.github.io/r1/.
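The two-stage factorization described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `Anchor`, `understanding_expert`, and `generation_expert` are hypothetical, the per-residue importance scores stand in for the output of the multimodal LLM, and uniform random resampling stands in for the diffusion-based generation expert. The sketch also covers sequence only, whereas the abstract describes co-design. It shows only the interface idea: explicit residue-level commitments are produced first, then treated as hard constraints during generation.

```python
import random
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


@dataclass(frozen=True)
class Anchor:
    """A residue-level commitment: keep `residue` at `position` (hypothetical type)."""
    position: int
    residue: str


def understanding_expert(sequence, scores, top_k):
    """Toy stand-in for the understanding expert.

    Assumption: `scores` holds one importance value per residue. In the
    paper this decision comes from a multimodal LLM; here we simply keep
    the top_k highest-scoring positions.
    """
    ranked = sorted(range(len(sequence)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:top_k])
    return [Anchor(i, sequence[i]) for i in chosen]


def generation_expert(sequence, anchors, seed=0):
    """Toy stand-in for the generation expert.

    Assumption: uniform random resampling replaces the diffusion model.
    Anchor positions are copied through unchanged; every other position
    is redesigned.
    """
    rng = random.Random(seed)
    fixed = {a.position: a.residue for a in anchors}
    designed = []
    for i in range(len(sequence)):
        if i in fixed:
            designed.append(fixed[i])          # respect the commitment
        else:
            designed.append(rng.choice(AMINO_ACIDS))  # free position
    return "".join(designed)


# Usage: commit to two residues, then redesign the rest around them.
wild_type = "ARDYW"
importance = [0.1, 0.9, 0.2, 0.8, 0.3]  # made-up scores for illustration
anchors = understanding_expert(wild_type, importance, top_k=2)
design = generation_expert(wild_type, anchors, seed=0)
```

Because the anchors are plain data rather than latent text guidance, they can be inspected, edited by hand, or reused with a different generator, which is the modularity the abstract emphasizes.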
Submission Number: 5