RAMM: Robust Adversarial Multimodal Learning for Protein Stability Prediction

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Multimodal learning, Adversarial deep learning, Representation learning, Protein stability, Mutation prediction
TL;DR: We introduce RAMM, a two-stage adversarial multimodal learning framework that makes protein stability prediction more robust to mutations and small datasets.
Abstract: Multimodal representations that integrate protein sequence and structure offer powerful priors for modeling protein properties, yet adapting them to small, task-specific datasets often leads to overfitting. We present RAMM, a two-stage adversarial multimodal learning framework for predicting the stability effects of protein mutations. In the first stage, we fine-tune a multimodal encoder on large protein datasets to capture general sequence–structure relationships. In the second stage, we train this encoder on target protein stability datasets while jointly optimizing an adversarial objective: a discriminator attempts to distinguish wild-type from mutant proteins, while the encoder learns features that fool the discriminator and are therefore robust to the domain shift between wild-type and mutant proteins. This adversarial game drives the system toward a Nash equilibrium in which the learned latent space is robust to the distributional shifts introduced by mutations. Evaluations on low-sequence-identity benchmarks show that this approach improves generalization, achieving AUROC = 0.763 on the SKEMPI 2.0 classification task and RMSE = 1.39 kcal/mol on the S669 regression benchmark. These results indicate that adversarial deep learning can enhance the robustness of multimodal protein models for challenging biological prediction tasks.
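The second-stage objective described in the abstract can be written as a generic min-max problem. The formulation below is a minimal sketch under assumed notation, not the authors' stated loss: the symbols f_θ (encoder), g_φ (stability predictor), d_ψ (discriminator), the per-example loss ℓ, and the trade-off weight λ are all hypothetical names introduced here for illustration, and the binary cross-entropy form of the discriminator loss is an assumption.

```latex
% Illustrative notation only; none of these symbols are taken from the paper.
% f_\theta : multimodal encoder      g_\phi : stability predictor
% d_\psi   : discriminator           \lambda \ge 0 : assumed trade-off weight
% y : stability label or \Delta\Delta G target
% s \in \{0,1\} : wild-type (0) vs. mutant (1) indicator
\begin{align}
\mathcal{L}_{\mathrm{task}}(\theta,\phi)
  &= \mathbb{E}_{(x,y)}\big[\ell\big(g_\phi(f_\theta(x)),\, y\big)\big] \\
\mathcal{L}_{\mathrm{disc}}(\theta,\psi)
  &= -\,\mathbb{E}_{(x,s)}\big[\, s \log d_\psi(f_\theta(x))
     + (1-s)\log\big(1 - d_\psi(f_\theta(x))\big) \big] \\
\min_{\theta,\phi}\;\max_{\psi}\;
  & \mathcal{L}_{\mathrm{task}}(\theta,\phi)
    \;-\; \lambda\,\mathcal{L}_{\mathrm{disc}}(\theta,\psi)
\end{align}
```

In this sketch the discriminator parameters ψ minimize the discriminator loss, while the encoder parameters θ minimize the task loss and simultaneously maximize the discriminator loss. At a saddle point neither player can improve unilaterally, which is the sense in which the abstract refers to a Nash equilibrium.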
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 23471