ImmunoFoundation: A Multimodal Foundation Model for Immunogenicity Prediction and Peptide Optimization
Keywords: foundation models, immunogenicity, AI4Science, AI4Cancer, computational biology
TL;DR: We introduce ImmunoFoundation, a multimodal foundation model for immunogenicity prediction that is pretrained on folded protein complexes.
Abstract: Peptide immunogenicity, whether a peptide presented by an MHC molecule elicits a T-cell response, is central to designing vaccines, cancer immunotherapies, and therapeutic proteins. Existing tools rely on a single modality, such as peptide sequences or peptide-MHC interactions, and often ignore the T-cell response, which depends on the TCR-peptide-MHC complex (TCR-pMHC) and its three-dimensional structure. The scarcity of labeled TCR-pMHC data with known structures makes it difficult to build a model that captures how all components of the TCR-pMHC contribute to immunogenicity. However, a foundation model of TCR-pMHCs can learn transferable representations across components, which can be adapted to immunogenicity, binding, and TCR specificity tasks, even with limited labeled data. We introduce **ImmunoFoundation**, a self-supervised multimodal backbone for protein-complex representation, fine-tuned for peptide-MHC immunogenicity. The model couples an ESM-2 sequence encoder with a graph transformer over structure, fused via cross-modal attention. Pretraining follows a curriculum that progressively introduces structural inductive bias. **ImmunoFoundation** outperforms prior multimodal class-I predictors on cancer neoepitope and infectious-disease tasks.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 64