Equivariant Protein Multi-task Learning

21 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Equivariant neural network, Multitask learning, Heterogeneous graph
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: In this work, we proposed a multitask dataset called PROMPT, as well as a Heterogeneous Multi-channel Equivariant Network for protein tasks, achieving the best results in PROMPT in various tasks.
Abstract: Understanding and leveraging the 3D structures of proteins is central to various tasks in biology and drug discovery. While deep learning has been applied successfully for modeling protein structures, current methods usually employ distinct models for different tasks. Such a single-task strategy is not only resource-consuming when the number of tasks increases but also incapable of combining multi-source datasets for larger-scale model training, given that protein datasets are usually of small size for most structural tasks. In this paper, we propose to adopt one single model to address multiple tasks jointly, upon the input of 3D protein structures. In particular, we first construct a standard multi-task benchmark called PROMPT, consisting of 6 representative tasks integrated from 4 public datasets. The resulting benchmark contains partially labeled data for training and fully-labeled data for validation/testing. Then, we develop a novel graph neural network for multi-task learning, dubbed Heterogeneous Multichannel Equivariant Network (HeMeNet), which is equivariant to 3D rotations/translations/reflections of proteins and able to capture various relationships between different atoms owing to the heterogeneous multichannel graph construction of proteins. Besides, HeMeNet is able to achieve task-specific learning via the task-aware readout mechanism. Extensive evaluations verify the effectiveness of multi-task learning on our benchmark, and our model generally surpasses state-of-the-art models. Our study is expected to open up a new venue for structure-based protein learning.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3556
Loading