Fusion-pMT: Biological Language Modeling for Tri-Molecular Binding in Immunogenicity Prediction

TMLR Paper6182 Authors

12 Oct 2025 (modified: 13 Oct 2025), Under review for TMLR, CC BY 4.0
Abstract: Recent advancements in multimodal techniques and large language models (LLMs) offer a new perspective on handling biological sequences through biological language modeling. One particularly critical yet underexplored challenge lies in modeling the tripartite interaction among peptide, MHC, and TCR, an essential step in understanding T cell-mediated immunity and improving immunogenicity prediction. In this paper, we propose Fusion-pMT, a biological language modeling framework that (1) learns unified representations of the three molecular inputs by leveraging their common structure as amino acid sequences, and (2) fuses the representations of each sequence to enable interaction among heterogeneous molecular inputs, aligning with the stepwise nature of immune recognition. Built on this foundation, Fusion-pMT effectively supports both pairwise and tripartite interaction modeling among peptide, MHC, and TCR. Moreover, its parameter-sharing design reduces memory usage during inference, making it lightweight and practical for biological applications. To validate its effectiveness, we conduct comprehensive experiments covering both pairwise and tripartite interactions (including out-of-distribution evaluation) and demonstrate that Fusion-pMT consistently outperforms state-of-the-art baselines across all benchmarks.
Submission Type: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=lpsb71iIju
Changes Since Last Submission: Revised the non-anonymous questions and redundant PDFs in the supplementary documents
Assigned Action Editor: ~Quanquan_Gu1
Submission Number: 6182