Keywords: Klebsiella, Machine Learning, depolymerase, phage
TL;DR: A machine learning ranking model that identifies candidate depolymerase proteins in phage proteomes targeting Klebsiella
Abstract: Antimicrobial resistance (AMR) has been declared a global threat by the World Health Organization, and the development of novel, effective antimicrobial therapies is an active research area of growing importance. Among the leading threats are Klebsiella species, which cause virulent antimicrobial-resistant infections with high mortality, particularly in hospital settings. Klebsiella species are especially problematic because they produce a thick, sticky polysaccharide capsule that protects them from antimicrobials and allows them to build highly resistant biofilms (defensive layers of cells). A natural way to eradicate Klebsiella capsules and biofilms is to use depolymerase proteins, often found in bacteriophages, which target and neutralize the polysaccharide capsules of specific Klebsiella species. However, machine-learning-guided discovery of depolymerase proteins in such phages remains unexplored. In this work, we use machine learning to help identify proteins in phage proteomes that can act as depolymerases against Klebsiella. Specifically, we use a dataset of phages whose depolymerase proteins target and neutralize the polysaccharide capsules of specific Klebsiella species. We train a ranking model that orders the proteins in an input phage proteome by their predicted ability to act as a depolymerase, and we evaluate its predictive accuracy with a non-redundant validation protocol. Our analysis shows that, for all test proteomes containing at least one depolymerase, the depolymerase protein was ranked within the top-scoring 5% of proteins. We expect that the proposed approach, called Depolymerase Ranker, will help accelerate the wet-lab discovery of such antibacterial proteins.
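
The top-5% criterion reported in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `depolymerase_in_top_fraction`, the rounding of the cutoff up to a whole protein, and the choice to count a proteome as a hit when any known depolymerase falls inside the cutoff are all assumptions made for illustration.

```python
import math


def depolymerase_in_top_fraction(scores, labels, fraction=0.05):
    """Check whether a known depolymerase ranks in the top `fraction` of a proteome.

    scores: one model score per protein (higher means more depolymerase-like).
    labels: one boolean per protein, True for known depolymerases.
    Hypothetical helper; the cutoff rounding and the "any" rule are assumptions.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n = len(scores)
    if n == 0:
        return False
    # Assumption: round the cutoff up and always inspect at least one protein.
    cutoff = max(1, math.ceil(fraction * n))
    # Rank protein indices from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return any(labels[i] for i in ranked[:cutoff])


# Toy proteome of 50 proteins with increasing scores; protein 47 is the depolymerase.
toy_scores = [i / 50 for i in range(50)]
toy_labels = [i == 47 for i in range(50)]
hit = depolymerase_in_top_fraction(toy_scores, toy_labels)
```

With 50 proteins and a 5% threshold, the cutoff covers the three highest-scoring proteins, so the toy proteome above counts as a hit.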