PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding

Minghao Xu; Zuobai Zhang; Jiarui Lu; Zhaocheng Zhu; Yangtian Zhang; Chang Ma; Runcheng Liu; Jian Tang

Published: 17 Sept 2022, Last Modified: 04 Aug 2025

Venue: NeurIPS 2022 Datasets and Benchmarks

Readers: Everyone

Keywords: Protein Modeling Benchmark, Protein Sequence Understanding, Multi-Task Learning

TL;DR: This work proposes a comprehensive and multi-task benchmark for protein sequence understanding, which studies both single-task and multi-task learning.

Abstract: We are now witnessing significant progress of deep learning methods on a variety of protein tasks (or datasets). However, there is no standard benchmark for evaluating the performance of different methods, which hinders the progress of deep learning in this field. In this paper, we propose such a benchmark called PEER, a comprehensive and multi-task benchmark for Protein sEquence undERstanding. PEER provides a set of diverse protein understanding tasks including protein function prediction, protein localization prediction, protein structure prediction, protein-protein interaction prediction, and protein-ligand interaction prediction. We evaluate different types of sequence-based methods on each task, including traditional feature engineering approaches, different sequence encoding methods, and large-scale pre-trained protein language models. In addition, we investigate the performance of these methods under the multi-task learning setting. Experimental results show that large-scale pre-trained protein language models achieve the best performance on most individual tasks, and that jointly training multiple tasks further boosts performance. The datasets and source code of this benchmark will be open-sourced soon.

Supplementary Material: pdf

Dataset Embargo: All datasets are available in the TorchDrug library (https://github.com/DeepGraphLearning/torchdrug/).

License: All source code of the PEER benchmark is licensed under the Apache License 2.0.

Author Statement: Yes

URL: https://github.com/DeepGraphLearning/PEER_Benchmark

Contribution Process Agreement: Yes

In Person Attendance: Yes

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/peer-a-comprehensive-and-multi-task-benchmark/code)
