Folding scFv–Antigen Complexes at Scale
Keywords: antibodies, scFv, DockQ, confidence metrics, AlphaFold
TL;DR: We benchmark modern cofolding models on 3,800 scFv–antigen complexes, generating ~200k predictions, and show that correct interfaces are often present in the ensembles but rarely selected, with inference-time choices and confidence metrics driving performance.
Abstract: Accurate modeling of antibody–antigen (Ab–Ag) complexes is central to biologic development, yet the reliability and failure modes of modern Ab–Ag folding pipelines remain poorly characterized. Single-chain variable fragments (scFvs) are therapeutically important antibody fragments, but large-scale evaluations of structure prediction models on scFv–Ag complexes are largely lacking. We introduce a scalable benchmarking pipeline that generates large ensembles of scFv–Ag structure predictions by cofolding a curated subset of 3,800 Ab–Ag complexes from SAbDab using multiple state-of-the-art models under diverse inference-time settings. The resulting dataset, **SCALE** (**sc**Fv–**A**g **C**omp**L**ex **E**nsembles), includes standardized scFv–Ag sequences and around 200,000 predicted complexes spanning different models, sampling strategies, and auxiliary inputs. Using **SCALE**, we evaluate model performance in recovering correct scFv–Ag interfaces and assess the ability of existing confidence metrics to select the best structure from prediction ensembles. We find that while confidence scores effectively distinguish easy from hard scFv–Ag complexes, they often fail to identify the highest-quality interface for a given target. Further analysis shows that near-correct interfaces typically appear in ensembles but at low frequency, and that inference-time choices such as sampling, recycling, and the use of evolutionary or structural information are crucial for accurate scFv–Ag complex prediction. Dataset and analysis code are available at https://huggingface.co/datasets/ravishah1/SCALE
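The gap between "present in the ensemble" and "selected by confidence" can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name, the tuple layout, and the toy values below are hypothetical and do not reflect the SCALE dataset schema. The only external convention assumed is the commonly used DockQ cutoff of 0.23 for an acceptable-quality interface.

```python
from collections import defaultdict

# Conventional DockQ cutoff for an "acceptable" interface (assumed here;
# the abstract does not state which threshold the authors use).
ACCEPTABLE_DOCKQ = 0.23


def selection_summary(predictions, threshold=ACCEPTABLE_DOCKQ):
    """Compare confidence-based selection with oracle selection.

    predictions: iterable of (target_id, confidence, dockq) tuples,
    a hypothetical layout chosen for this illustration.
    """
    by_target = defaultdict(list)
    for target, confidence, dockq in predictions:
        by_target[target].append((confidence, dockq))

    n = len(by_target)
    if n == 0:
        raise ValueError("no predictions given")

    top1_hits = 0    # targets where the most confident model is acceptable
    oracle_hits = 0  # targets where any model in the ensemble is acceptable
    for members in by_target.values():
        selected_dockq = max(members, key=lambda m: m[0])[1]
        best_dockq = max(m[1] for m in members)
        top1_hits += selected_dockq >= threshold
        oracle_hits += best_dockq >= threshold

    return {
        "targets": n,
        "top1_success": top1_hits / n,
        "oracle_success": oracle_hits / n,
    }


# Toy ensemble (made-up numbers): for target "t1" the most confident
# prediction has a poor interface although a good one exists.
toy_predictions = [
    ("t1", 0.90, 0.10), ("t1", 0.60, 0.55), ("t1", 0.50, 0.05),
    ("t2", 0.80, 0.70), ("t2", 0.40, 0.30),
]
result = selection_summary(toy_predictions)
print(result)
```

In this toy case the oracle success rate exceeds the top-1 success rate, which is the kind of difference the abstract refers to when it says correct interfaces are sampled but not selected.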
Submission Number: 81