Efficient Parameter Tuning of Large Protein Language Models for De Novo Protein Design

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: De novo protein design; protein language model; prefix tuning.
Abstract: Protein language models (ProtLMs) have achieved unprecedented breakthroughs in protein design. However, optimizing ProtLMs effectively with limited data has been challenging due to their large number of parameters. In this study, we introduce prefix tuning to efficiently prompt pre-trained ProtLMs for de novo protein design with desired structures and functions. During training, only the prefix virtual tokens are trainable, while the pre-trained ProtLM is frozen. We trained two prefix virtual tokens, one on an antimicrobial peptide (AMP) dataset and one on an alpha-helix structure dataset. Our results demonstrate that prefix tuning efficiently prompts the pre-trained ProtLM by optimizing far fewer trainable parameters, achieving superior results compared with fine-tuning, even in low-data settings. Furthermore, the two prefix virtual tokens can be combined to precisely control protein generation with both desired properties, a capability not offered by other tuning methods. We anticipate that prefix tuning will contribute to protein discovery and biomedical advancement.
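To make the tuning scheme described in the abstract concrete, the sketch below shows one common way to implement prefix tuning on a frozen causal protein language model: trainable virtual-token embeddings are prepended to the input embeddings, and only those embeddings receive gradients. This is a minimal illustration under stated assumptions, not the authors' implementation; the checkpoint name (`nferruz/ProtGPT2`), the prefix length, and the class name are hypothetical placeholders.

```python
# Minimal prefix-tuning sketch for a frozen protein language model.
# Assumptions: Hugging Face Transformers is available; the base checkpoint
# name and prefix length are illustrative, not taken from the paper.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM


class PrefixTunedProtLM(nn.Module):
    def __init__(self, base_name: str = "nferruz/ProtGPT2", prefix_len: int = 20):
        super().__init__()
        self.lm = AutoModelForCausalLM.from_pretrained(base_name)
        for p in self.lm.parameters():  # freeze the pre-trained ProtLM
            p.requires_grad = False
        hidden = self.lm.config.hidden_size
        # Only these prefix virtual-token embeddings are trainable.
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden) * 0.02)

    def forward(self, input_ids, labels=None):
        tok_emb = self.lm.get_input_embeddings()(input_ids)            # (B, T, H)
        prefix = self.prefix.unsqueeze(0).expand(input_ids.size(0), -1, -1)
        inputs_embeds = torch.cat([prefix, tok_emb], dim=1)            # prepend prefix
        if labels is not None:
            # Mask prefix positions out of the language-modelling loss.
            pad = torch.full(
                (input_ids.size(0), self.prefix.size(0)), -100,
                dtype=labels.dtype, device=labels.device,
            )
            labels = torch.cat([pad, labels], dim=1)
        return self.lm(inputs_embeds=inputs_embeds, labels=labels)
```

Combining two properties, as the abstract describes, could then amount to concatenating the separately trained AMP and alpha-helix prefix embeddings before the sequence embeddings at generation time; the exact composition used by the authors is not specified here.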
Supplementary Material: pdf
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6673