Protap: A Benchmark for Protein Modeling on Realistic Downstream Applications

ICLR 2026 Conference Submission16660 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0

Keywords: protein representation learning, protein downstream application, protein benchmark

TL;DR: We introduce Protap, a benchmark for evaluating protein modeling across diverse realistic tasks.

Abstract: A wide range of deep learning architectures and pretraining strategies have recently been explored to support downstream protein applications. In addition, domain-specific models that incorporate biological knowledge have been developed to improve performance on specialized tasks. In this work, we introduce Protap, a comprehensive benchmark that systematically compares backbone architectures, pretraining strategies, and domain-specific models across diverse and realistic downstream protein applications. Specifically, Protap covers five applications: three general tasks and two novel specialized tasks, namely enzyme-catalyzed protein cleavage site prediction and targeted protein degradation, which are industrially relevant yet missing from existing benchmarks. For each application, Protap compares various domain-specific models and general architectures under multiple pretraining settings. Our empirical studies suggest that: (i) although encoders pretrained at large scale achieve strong results, they often underperform supervised encoders trained on small downstream training sets; (ii) incorporating structural information during downstream fine-tuning can match or even outperform protein language models pretrained on large-scale sequence corpora; and (iii) domain-specific biological priors can enhance performance on specialized downstream tasks. Code is publicly available at https://anonymous.4open.science/r/protap-1CC5.

Primary Area: datasets and benchmarks

Submission Number: 16660
