MultiAbRank: Benchmarking De Novo Antibody Design Under Multi-Objective Constraints

Published: 23 May 2026, Last Modified: 13 Jun 2026 · SD4H ICML 2026 Poster · CC BY 4.0
Keywords: De Novo Antibody Design, Language Models, Diffusion Models, Protein Language Models, Multi-task Evaluation, Benchmarking
TL;DR: We present MultiAbRank, a multi-objective benchmark showing that state-of-the-art antibody design models collapse under realistic joint constraints, exposing a critical gap between structural plausibility and true binding performance.
Abstract: In recent years, diffusion and sequence-first models have been developed to rapidly generate antibody candidates. Methods for benchmarking and evaluating these candidates have also emerged, built on physics-based scoring metrics that analyze features such as structural stability and interface geometry. Proper evaluation of antibody candidates and computational design models requires examining binding, structural plausibility, biological realism, and novelty. However, current de novo antibody evaluation methods rely on reconstruction-based criteria or structural confidence proxies and do not cover these four objectives, so they fail to address the underlying challenge that hinders in vitro and in vivo translation of de novo designs. We introduce MultiAbRank, a benchmark for de novo antibody design models with three complementary tasks, evaluate current models on the tasks each supports, and score the candidates produced in these tasks with a multi-objective scoring method. We also expose weaknesses in current antibody candidate evaluation and offer a standardized framework for future computational design models.
Submission Number: 114