Estimating the Limits of Organism-Specific Training for Epitope Prediction

Published: 01 Jan 2023, Last Modified: 03 Jul 2025, Venue: BIBM 2023, License: CC BY-SA 4.0
Abstract: The identification of linear B-cell epitopes is an important task in the development of vaccines, therapeutic antibodies, and several diagnostic tests. Recently, organism-specific training has been shown to improve prediction performance for data-rich organisms. This article investigates the limits of organism-specific training for epitope prediction by systematically quantifying the effect of the amount of training data on the performance of the resulting models. The results indicate that even models trained on small organism-specific data sets can outperform similar models trained on much larger heterogeneous and mixed data sets, as well as widely used predictors from the literature, which are trained on heterogeneous data. These results suggest the potential for a much broader applicability of pathogen-specific models, which can be used to accelerate the development of diagnostic tests and vaccines in the context of emerging pathogens and to support faster responses in future disease outbreaks.