Folding scFv–Antigen Complexes at Scale

Ravi Shah; Jeffrey Ouyang-Zhang; Zachary Cohen; Maria Rosaria Briglia; Chi Zhang; Adam Klivans; Daniel Jesus Diaz

Published: 02 Mar 2026, Last Modified: 16 Apr 2026, Venue: GEM 2026, License: CC BY 4.0

Keywords: antibodies, scFv, DockQ, confidence metrics, AlphaFold

TL;DR: We benchmark modern cofolding models on 3,800 scFv–antigen complexes, generating ~200k predictions, and show that correct interfaces are often present but rarely selected, with inference-time choices and confidence metrics driving performance.

Abstract: Accurate modeling of antibody–antigen (Ab–Ag) complexes is central to biologic development, yet the reliability and failure modes of modern Ab–Ag folding pipelines remain poorly characterized. Single-chain variable fragments (scFvs) are therapeutically important antibodies, but large-scale evaluations of structure prediction models on scFv–Ag complexes are largely lacking. We introduce a scalable benchmarking pipeline that generates large ensembles of scFv–Ag structure predictions by cofolding a curated subset of 3,800 Ab–Ag complexes from SAbDab using multiple state-of-the-art models under diverse inference-time settings. The resulting dataset, **SCALE** (**sc**Fv–**A**g **C**omp**L**ex **E**nsembles), includes standardized scFv–Ag sequences and around 200,000 predicted complexes spanning different models, sampling strategies, and auxiliary inputs. Using **SCALE**, we evaluate model performance in recovering correct scFv–Ag interfaces and assess the ability of existing confidence metrics to select the best structure from prediction ensembles. We find that while confidence scores effectively distinguish easy from hard scFv–Ag complexes, they often fail to identify the highest-quality interface for a given target. Further analysis shows that near-correct interfaces typically appear in ensembles but at low frequency, and that inference-time choices such as sampling, recycling, and the use of evolutionary or structural information are crucial for accurate scFv–Ag complex predictions. Dataset and analysis code are available at https://huggingface.co/datasets/ravishah1/SCALE.
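The gap between what an ensemble contains and what a confidence metric selects can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' analysis code: the `Prediction` record, the `summarize` helper, and the toy values are all hypothetical, and the 0.23 cutoff is the conventional DockQ threshold for an "acceptable" interface, which may differ from the criterion used in the paper. For each target it compares the DockQ of the top-confidence prediction with the best DockQ in the ensemble, and reports how often acceptable interfaces occur.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """One predicted complex (hypothetical record layout, not the SCALE schema)."""
    target: str        # identifier of the scFv-Ag complex
    confidence: float  # model-reported confidence score (assumed: higher is better)
    dockq: float       # DockQ of the predicted interface against the reference


# Conventional DockQ cutoff for an "acceptable" interface (assumption for this sketch).
ACCEPTABLE_DOCKQ = 0.23


def summarize(predictions, threshold=ACCEPTABLE_DOCKQ):
    """Per target: DockQ of the confidence-selected model vs. the best in the ensemble."""
    by_target = {}
    for p in predictions:
        by_target.setdefault(p.target, []).append(p)

    summary = {}
    for target, ensemble in by_target.items():
        selected = max(ensemble, key=lambda p: p.confidence)  # what ranking would pick
        oracle = max(ensemble, key=lambda p: p.dockq)         # best available interface
        hits = sum(p.dockq >= threshold for p in ensemble)
        summary[target] = {
            "selected_dockq": selected.dockq,
            "oracle_dockq": oracle.dockq,
            "selection_gap": oracle.dockq - selected.dockq,
            "hit_fraction": hits / len(ensemble),
        }
    return summary


# Toy ensemble with made-up numbers, purely to exercise the function.
toy = [
    Prediction("T1", 0.90, 0.10),
    Prediction("T1", 0.85, 0.05),
    Prediction("T1", 0.60, 0.55),  # good interface, but ranked low by confidence
    Prediction("T1", 0.80, 0.08),
    Prediction("T2", 0.95, 0.70),
    Prediction("T2", 0.70, 0.65),
]

if __name__ == "__main__":
    for target, stats in summarize(toy).items():
        print(target, stats)
```

In this toy data, target `T1` has an acceptable interface in its ensemble that confidence ranking misses, giving a large selection gap and a low hit fraction, while `T2` is an easy target where selection and oracle agree. This mirrors the distinction the abstract draws between separating easy from hard targets and ranking predictions within a single target.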

Submission Number: 81
