GRAM-DTI: Adaptive Multimodal Representation Learning for Drug–Target Interaction Prediction

ICLR 2026 Conference Submission 19306 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Drug-target interaction prediction, Multimodal representation learning, Adaptive modality dropout
Abstract: Drug–target interaction (DTI) prediction is a cornerstone of computational drug discovery, enabling rational design, repurposing, and mechanistic insights. While deep learning has advanced DTI modeling, existing approaches primarily rely on SMILES–protein pairs and fail to exploit the rich multimodal information available for small molecules and proteins. Inspired by recent successes in multimodal molecular property prediction, we introduce GRAM-DTI, a pretraining framework that integrates multimodal small-molecule and protein inputs into a unified representation. GRAM-DTI extends volume-based contrastive learning to four modalities, capturing higher-order semantic alignment beyond conventional pairwise approaches. To account for differences in modality informativeness, we propose adaptive modality dropout, which dynamically regulates each modality’s contribution during pretraining. Additionally, IC50 activity measurements, when available, are incorporated as weak supervision to ground representations in biologically meaningful interaction strengths. Experiments on four publicly available datasets demonstrate that GRAM-DTI consistently outperforms state-of-the-art baselines. Our results highlight the benefits of higher-order multimodal alignment, adaptive modality utilization, and auxiliary supervision for robust and generalizable DTI prediction.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 19306
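
The abstract does not spell out the volume-based objective. As a rough illustration only, the sketch below assumes the common Gram-determinant formulation, in which the volume of the parallelotope spanned by unit-normalized modality embeddings serves as a joint alignment score (a smaller volume means the embeddings point in more similar directions). The function names and the toy 4-dimensional vectors are hypothetical and are not taken from the submission; the contrastive loss built on top of this score is not shown.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gram_matrix(vectors):
    """Matrix of pairwise dot products between the given vectors."""
    return [[sum(a * b for a, b in zip(u, w)) for w in vectors] for u in vectors]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular: the vectors are linearly dependent
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def volume(embeddings):
    """Volume spanned by unit-normalized embeddings: sqrt(det(Gram)).

    Assumption: one embedding per modality (four here); a lower volume
    is read as tighter joint alignment across all modalities at once.
    """
    unit = [normalize(v) for v in embeddings]
    det = determinant(gram_matrix(unit))
    return math.sqrt(max(det, 0.0))  # clamp tiny negative round-off


# Toy inputs (hypothetical): four mutually orthogonal vs. four identical embeddings.
orthogonal = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
aligned = [[1, 1, 0, 0]] * 4
```

With these toy inputs, `volume(orthogonal)` is at its maximum of 1 and `volume(aligned)` collapses to 0. Unlike a sum of pairwise cosine similarities, a single determinant depends on all four embeddings jointly, which is one way to read the abstract's "higher-order" alignment.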