Transfer Learning on Protein Language Models Improves Antimicrobial Peptide Classification

Published: 23 Jun 2025, Last Modified: 23 Jun 2025 | Greeks in AI 2025 Poster | CC BY 4.0
Keywords: AI for Science, Protein Language Models, Transfer Learning
TL;DR: We demonstrate that efficiently fine-tuned Protein Language Models can achieve state-of-the-art accuracy in classifying antimicrobial peptides.
Abstract: Antimicrobial peptides (AMPs) are essential components of the innate immune system in humans and other organisms, exhibiting potent activity against a broad spectrum of pathogens. Their potential therapeutic applications, particularly in combating antibiotic resistance, have made AMP classification a vital task in computational biology. However, the scarcity of labeled AMP sequences, coupled with the diversity and complexity of AMPs, poses significant challenges for training standalone AMP classifiers. Self-supervised learning has emerged as a powerful paradigm for addressing such challenges across various fields, leading to the development of Protein Language Models (PLMs). These models leverage vast amounts of unlabeled protein sequences to learn biologically relevant features, providing transferable protein sequence representations (embeddings) that can be fine-tuned for downstream tasks even with limited labeled data. This study evaluates several publicly available PLMs on AMP classification using transfer learning techniques and benchmarks them against state-of-the-art neural classifiers. Our key findings are: (a) model scale is crucial, with classification performance consistently improving as model size increases; (b) state-of-the-art results are achieved with minimal effort by pairing PLM embedding representations with shallow classifiers; and (c) classification performance is further enhanced through efficient fine-tuning of the PLMs’ parameters.
Preprint: https://www.researchsquare.com/article/rs-5768912/v1
Under review at: Nature Scientific Reports
Keywords: AI for Science, Protein Language Models, Antimicrobial Peptide Classification, Transfer Learning
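To make finding (b) concrete, the following is a minimal sketch of the "frozen embeddings plus shallow classifier" pattern. It is not the authors' pipeline. Everything in it is an assumption for illustration: `make_placeholder_plm` is a hypothetical stand-in that assigns a fixed random vector to each residue (a real setup would take per-residue embeddings from a pretrained PLM instead), the toy sequences and labels are synthetic rather than real AMP data, and the shallow classifier is a plain logistic regression trained by gradient descent.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 8  # toy size; real PLM embeddings are much larger

# Synthetic toy data (NOT real AMP annotations): label 1 = cationic-looking, 0 = acidic-looking.
TOY_SEQS = ["KKLLKKLLKK", "RRWWRRWWRR", "KRKRLLWWKK",
            "DDEEGGSSDD", "EEDDSSGGEE", "DEDEGGSSTT"]
TOY_LABELS = [1, 1, 1, 0, 0, 0]


def make_placeholder_plm(seed=0):
    """Hypothetical stand-in for a frozen PLM: one fixed random vector per residue."""
    rng = random.Random(seed)
    return {aa: [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)] for aa in AMINO_ACIDS}


def embed_sequence(seq, table):
    """Mean-pool per-residue vectors into one fixed-length sequence embedding."""
    vecs = [table[aa] for aa in seq]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, w, b):
    """Probability of the positive class under a linear (shallow) classifier."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def log_loss(X, y, w, b):
    """Mean binary cross-entropy, clipped to avoid log(0)."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(xi, w, b), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def train_shallow_classifier(X, y, lr=0.1, epochs=500):
    """Logistic regression via full-batch gradient descent; the embeddings stay frozen."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


if __name__ == "__main__":
    table = make_placeholder_plm()
    X = [embed_sequence(s, table) for s in TOY_SEQS]
    w, b = train_shallow_classifier(X, TOY_LABELS)
    for seq, x in zip(TOY_SEQS, X):
        print(seq, round(predict_proba(x, w, b), 3))
```

Only the small weight vector `w` and bias `b` are trained here; the embedding function is never updated. Finding (c) differs in exactly this respect, since efficient fine-tuning also updates some of the PLM's own parameters.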
Submission Number: 115