ProteinAdapter: Adapting Pre-trained Large Protein Models for Efficient Protein Representation Learning
The study of proteins is crucial in various scientific disciplines, but understanding their intricate multi-level relationships remains challenging. Recent advances in Large Protein Models (LPMs) have demonstrated strong capabilities in sequence and structure understanding, suggesting that they can be directly leveraged for efficient protein representation learning. In this work, we introduce ProteinAdapter, which efficiently transfers general representations from multiple LPMs, e.g., ESM-1b, to task-specific knowledge, largely reducing the need for labor-intensive analysis of 3D positions and amino acid order. We observe that this simple yet effective approach works well on multiple downstream tasks. Specifically, (1) with limited extra parameters, ProteinAdapter enables multi-level protein representation learning by integrating both sequence and geometric structure embeddings from LPMs; (2) based on the learned embeddings, we further scale ProteinAdapter to multiple conventional protein tasks. Considering different task priors, we propose a unified multi-scale predictor that fully exploits the learned embeddings via a task-specific focus. Extensive experiments on over 20 tasks show that ProteinAdapter outperforms state-of-the-art methods under both single-task and multi-task settings. We hope the proposed method will accelerate future research on protein analysis.
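As a rough illustration of the adapter idea summarized above (not the authors' released implementation), the sketch below shows how frozen sequence and structure embeddings from pre-trained LPMs might be fused by a small trainable module topped with a task-specific head; all names and dimensions (SimpleProteinAdapter, seq_dim, struct_dim, hidden_dim) are hypothetical assumptions.

```python
# Minimal sketch (assumed PyTorch): fuse frozen LPM sequence and structure
# embeddings with a small trainable adapter, then apply a per-task head.
# The class and dimension names are illustrative, not the paper's code.
import torch
import torch.nn as nn


class SimpleProteinAdapter(nn.Module):
    def __init__(self, seq_dim=1280, struct_dim=512, hidden_dim=256, num_classes=2):
        super().__init__()
        # Small trainable projections on top of frozen LPM outputs.
        self.seq_proj = nn.Linear(seq_dim, hidden_dim)
        self.struct_proj = nn.Linear(struct_dim, hidden_dim)
        self.fuse = nn.Sequential(
            nn.LayerNorm(hidden_dim),
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
        )
        # Task-specific head (one such head per downstream task in a multi-task setup).
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, seq_emb, struct_emb):
        # seq_emb:    (batch, length, seq_dim)    from a frozen sequence LPM (e.g., ESM-1b)
        # struct_emb: (batch, length, struct_dim) from a frozen structure encoder
        h = self.seq_proj(seq_emb) + self.struct_proj(struct_emb)
        h = self.fuse(h)
        # Mean-pool over residues for a protein-level prediction.
        return self.head(h.mean(dim=1))


# Usage with random tensors standing in for pre-computed LPM embeddings.
adapter = SimpleProteinAdapter()
seq_emb = torch.randn(4, 100, 1280)
struct_emb = torch.randn(4, 100, 512)
logits = adapter(seq_emb, struct_emb)  # shape: (4, 2)
```

Only the adapter and head would be trained under this scheme, which is what keeps the extra parameter count small relative to the frozen LPM backbones.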