PEPTRIX: A FRAMEWORK FOR EXPLAINABLE PEPTIDE ANALYSIS THROUGH PROTEIN LANGUAGE MODELS

18 Sept 2025 (modified: 13 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Representation Learning, Computational Biology, Protein Language Models, Machine Learning, XAI, Contrastive Training, Graph Attention Network
TL;DR: PepTriX is a framework for explainable peptide analysis that uses a graph attention network to combine 1D sequence embeddings with 3D structural features, enabling robust, interpretable, and generalizable predictions for peptide classification tasks.
Abstract: Peptide classification tasks, such as predicting toxicity and HIV inhibition, are fundamental to bioinformatics and drug discovery. Traditional approaches rely heavily on handcrafted encodings of one-dimensional (1D) peptide sequences, which can limit generalizability across tasks and datasets. Recently, protein language models (PLMs), such as ESM-2 and ESMFold, have demonstrated strong predictive performance. However, they face two critical challenges. First, fine-tuning is computationally costly. Second, their complex latent representations hinder interpretability for domain experts. Additionally, many frameworks have been developed for a specific type of peptide classification and do not generalize to others. Together, these limitations restrict the ability to connect model predictions to biologically relevant motifs and structural properties. To address them, we present PepTriX, a novel framework that integrates 1D sequence embeddings and three-dimensional (3D) structural features via a graph attention network enhanced with contrastive training and cross-modal co-attention. PepTriX automatically adapts to diverse datasets, producing task-specific peptide vectors while retaining biological plausibility. In an evaluation involving domain experts, we found that PepTriX performs well across multiple peptide classification tasks and provides interpretable insights into the structural and biophysical motifs that drive predictions. Thus, PepTriX offers both predictive robustness and interpretable validation, bridging the gap between performance-driven PLMs and domain-level understanding in peptide research.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Supplementary Material: zip
Submission Number: 12050