Abstract: Epitope-based vaccines are promising therapeutic modalities for infectious diseases and cancer, but identifying immunogenic epitopes is challenging. Most prediction methods use only amino acid sequence information and do not incorporate wide-scale structural data or biochemical properties across each peptide–major histocompatibility complex (MHC). We present ImmunoStruct, a deep learning model that integrates sequence, structural and biochemical information to predict multi-allele class I peptide–MHC immunogenicity. By leveraging a multimodal dataset of 26,049 peptide–MHCs, we demonstrate that ImmunoStruct improves immunogenicity prediction performance and interpretability beyond existing methods, across infectious disease epitopes and cancer neoepitopes. We further show strong alignment with in vitro assay results for a set of SARS-CoV-2 epitopes, as well as strong performance in peptide–MHC-based survival prediction for patients with cancer. Overall, this work presents an architecture that incorporates equivariant graph processing and multimodal data integration for a long-standing challenge in immunotherapy. A multimodal deep learning model combines molecular sequence, structure and biochemical properties to predict immunogenicity in an interpretable way, providing a framework for smarter molecular prediction and hypothesis generation.
External IDs: doi:10.1038/s42256-025-01163-y