Mechanistic Synergy in Multi-Modal VEP: DNA Context Complements PLMs under Biophysical Constraints

Published: 28 May 2026, Last Modified: 03 Jun 2026
Venue: ICML 2026 FM4LS Workshop Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Variant Effect Prediction, Multi-modal Foundation Models, Protein Language Models, DNA Language Models, Mechanistic Synergy
Abstract: While Protein Language Models (PLMs) have advanced Variant Effect Prediction (VEP), they can sometimes overlook the complex physical and regulatory contexts of the cell. To address this limitation, we propose a parameter-efficient multi-modal foundation model architecture that integrates DNA signals from a DNA Language Model (DLM) with protein representations from a PLM. Through a systematic analysis across 29 activity-based Deep Mutational Scanning (DMS) datasets from ProteinGym, we demonstrate that the nucleotide modality provides more than a simple ensemble gain; it specifically corrects PLM failure modes localized to charged residues. Ultimately, this work highlights that moving beyond unimodal representations is essential for capturing the mechanistic complexity of biological systems and for accurately identifying experimentally validated functional hotspots.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 26