AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction

Published: 17 Jun 2024, Last Modified: 17 Jul 2024, Venue: ICML2024-AI4Science Poster, License: CC BY 4.0
Keywords: Antibody; Epitope; Benchmark; Graph Neural Networks; Protein Language Models
TL;DR: We provide a novel dataset, benchmark recent methods on the task of epitope prediction, and formulate the problem as a bipartite graph link prediction task.
Abstract: Epitope identification is vital for antibody design yet challenging due to the inherent variability of antibodies. The challenge is heightened by the lack of a consistent evaluation pipeline, limited dataset size, and limited epitope diversity. Our contributions are two-fold. First, we provide the largest specialized epitope prediction dataset -- AsEP, consisting of $1723$ filtered antibody-antigen complexes. AsEP addresses the dataset diversity issue with clustered epitope groups. Second, most current methods for epitope prediction focus solely on the antigen, while few consider \textit{both} antibody and antigen. Instead, we conceptualize the antibody-antigen interaction as a bipartite graph and formulate epitope prediction as a link prediction task. This formulation allows model predictions to be attributed to interaction types, providing more interpretability. Our method, WALLE, leverages protein language models to capture sequence-level information and graph networks to incorporate structural information. WALLE outperforms existing models, achieving an MCC of $0.210$, a roughly six-fold improvement over MaSIF-site. The curated dataset AsEP and our method WALLE are available to the research community, fostering open-source collaboration and advancement of the field.
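To make the bipartite link prediction framing and the MCC metric concrete, here is a minimal Python sketch. It is not WALLE's implementation: the helper names (`edges_to_epitope_labels`, `mcc`), the toy edge lists, and the rule that an antigen residue counts as epitope if it has at least one predicted link are all illustrative assumptions.

```python
import math

def edges_to_epitope_labels(edges, n_antigen):
    """Collapse (antibody_idx, antigen_idx) links into per-residue labels.

    Assumption for illustration: an antigen residue is labelled as
    epitope (1) if at least one antibody residue is linked to it.
    """
    labels = [0] * n_antigen
    for _antibody_idx, antigen_idx in edges:
        labels[antigen_idx] = 1
    return labels

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy antigen with 6 residues; edges are hypothetical, not from AsEP.
true_edges = [(0, 1), (1, 1), (2, 3)]   # ground-truth contacts
pred_edges = [(0, 1), (2, 4)]           # predicted contacts

true_labels = edges_to_epitope_labels(true_edges, n_antigen=6)
pred_labels = edges_to_epitope_labels(pred_edges, n_antigen=6)
score = mcc(true_labels, pred_labels)
print(true_labels, pred_labels, score)
```

In this toy case one of two true epitope residues is recovered and one false positive is introduced, giving an MCC of 0.25. Because MCC accounts for all four confusion-matrix cells, it remains informative when epitope residues are a small fraction of the antigen surface.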
Submission Number: 125