ProteinAdapter: Adapting Pre-trained Large Protein Models for Efficient Protein Representation Learning
The study of proteins is crucial in various scientific disciplines, but understanding their intricate multi-level relationships remains challenging. Recent advances in Large Protein Models (LPMs) have demonstrated strong capabilities in sequence and structure understanding, suggesting that they can be directly leveraged for efficient protein representation learning. In this work, we introduce ProteinAdapter, which efficiently transfers general representations from multiple LPMs, e.g., ESM-1b, to task-specific knowledge, largely reducing the need for labor-intensive analysis of 3D positions and amino acid order. We observe that this simple yet effective approach works well on multiple downstream tasks. Specifically, (1) with limited extra parameters, ProteinAdapter enables multi-level protein representation learning by integrating both sequence and geometric structure embeddings from LPMs; (2) based on the learned embeddings, we further scale ProteinAdapter to multiple conventional protein tasks. Considering different task priors, we propose a unified multi-scale predictor that fully exploits the learned embeddings via a task-specific focus. Extensive experiments on over 20 tasks show that ProteinAdapter outperforms state-of-the-art methods under both single-task and multi-task settings. We hope the proposed method will accelerate future research on protein analysis.
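As a rough illustration of the adapter idea summarized above (not the authors' released implementation), the sketch below shows how frozen sequence and structure embeddings from pre-trained LPMs might be fused by a small trainable module topped with a task-specific head; all names and dimensions (SimpleProteinAdapter, seq_dim, struct_dim, hidden_dim) are hypothetical assumptions.

```python
# Minimal sketch (assumed PyTorch): fuse frozen LPM sequence and structure
# embeddings with a small trainable adapter, then apply a per-task head.
# The class and dimension names are illustrative, not the paper's code.
import torch
import torch.nn as nn


class SimpleProteinAdapter(nn.Module):
    def __init__(self, seq_dim=1280, struct_dim=512, hidden_dim=256, num_classes=2):
        super().__init__()
        # Small trainable projections on top of frozen LPM outputs.
        self.seq_proj = nn.Linear(seq_dim, hidden_dim)
        self.struct_proj = nn.Linear(struct_dim, hidden_dim)
        self.fuse = nn.Sequential(
            nn.LayerNorm(hidden_dim),
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
        )
        # Task-specific head (one such head per downstream task in a multi-task setup).
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, seq_emb, struct_emb):
        # seq_emb:    (batch, length, seq_dim)    from a frozen sequence LPM (e.g., ESM-1b)
        # struct_emb: (batch, length, struct_dim) from a frozen structure encoder
        h = self.seq_proj(seq_emb) + self.struct_proj(struct_emb)
        h = self.fuse(h)
        # Mean-pool over residues for a protein-level prediction.
        return self.head(h.mean(dim=1))


# Usage with random tensors standing in for pre-computed LPM embeddings.
adapter = SimpleProteinAdapter()
seq_emb = torch.randn(4, 100, 1280)
struct_emb = torch.randn(4, 100, 512)
logits = adapter(seq_emb, struct_emb)  # shape: (4, 2)
```

Only the adapter and head would be trained under this scheme, which is what keeps the extra parameter count small relative to the frozen LPM backbones.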