Proteins are the fundamental building blocks of life, carrying out essential biological functions. Learning effective protein representations is critical for applications such as drug design and function prediction. Language models (LMs) and graph neural networks (GNNs) have shown promising performance in modeling proteins. However, proteins come with multiple data modalities, including sequence, structure, and functional annotations, and frameworks that integrate these diverse sources without large-scale pre-training remain underdeveloped. In this work, we propose ProteinSSA, a multimodal knowledge distillation framework that incorporates Protein Sequence, Structure, and Gene Ontology (GO) Annotation into unified representations. Our approach trains a teacher and a student model connected via distillation. The student GNN encodes protein sequences and structures, while the teacher combines a GNN with an auxiliary GO encoder to incorporate functional knowledge, generating hybrid multimodal embeddings that are passed to the student, which learns function-enriched representations by distribution approximation. Experiments on tasks such as protein fold and enzyme commission (EC) number prediction show that ProteinSSA significantly outperforms state-of-the-art baselines, demonstrating the benefits of our multimodal framework.
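To make the teacher-student setup concrete, the following is a minimal PyTorch sketch, assuming "distribution approximation" is realized as a KL divergence between Gaussian latent distributions; all module names, feature dimensions, and the loss weight are illustrative assumptions, not the authors' implementation, and the GNN encoders are stood in by MLPs over precomputed per-protein features so the example stays self-contained.

```python
# Sketch of a ProteinSSA-style teacher-student distillation step.
# Assumptions: Gaussian latents matched via KL; MLPs stand in for GNNs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Stand-in for a GNN over residue graphs; outputs a latent Gaussian."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.mu = nn.Linear(hid_dim, hid_dim)      # latent mean
        self.logvar = nn.Linear(hid_dim, hid_dim)  # latent log-variance

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

def gaussian_kl(mu_s, logvar_s, mu_t, logvar_t):
    """KL( N(mu_s, var_s) || N(mu_t, var_t) ), summed over latent dims."""
    var_s, var_t = logvar_s.exp(), logvar_t.exp()
    kl = 0.5 * (logvar_t - logvar_s + (var_s + (mu_s - mu_t) ** 2) / var_t - 1)
    return kl.sum(dim=-1).mean()

# Teacher sees sequence/structure features concatenated with a GO-annotation
# embedding; the student sees only the sequence/structure features.
feat_dim, go_dim, hid = 128, 64, 32
teacher = Encoder(feat_dim + go_dim, hid)
student = Encoder(feat_dim, hid)
head = nn.Linear(hid, 10)               # student task head (e.g., fold classes)

x = torch.randn(8, feat_dim)            # toy protein features (batch of 8)
go = torch.randn(8, go_dim)             # toy GO-annotation embeddings
labels = torch.randint(0, 10, (8,))     # toy task labels

with torch.no_grad():                   # teacher frozen during distillation
    mu_t, logvar_t = teacher(torch.cat([x, go], dim=-1))
mu_s, logvar_s = student(x)

task_loss = F.cross_entropy(head(mu_s), labels)
distill_loss = gaussian_kl(mu_s, logvar_s, mu_t, logvar_t)
loss = task_loss + 0.1 * distill_loss   # 0.1: illustrative trade-off weight
loss.backward()
print(f"task={task_loss.item():.3f}  distill={distill_loss.item():.3f}")
```

The point of this shape is that GO annotations influence only the teacher's latent distribution; at inference time the student needs sequence and structure alone, yet its embeddings have been pulled toward the function-enriched teacher distribution during training.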