A Comprehensive Survey of Multimodal LLMs for Scientific Discovery

Published: 12 Nov 2025, Last Modified: 12 Nov 2025, Venue: VLM4RWD2025 Demo, License: CC BY 4.0
Track: Demo papers (2-4 pages)
Keywords: MLLMs, AI for Science, Survey, drug discovery, molecular & protein design, materials science, genomics.
TL;DR: A survey that systematically reviews the progress of MLLMs in key scientific domains, including drug discovery, molecular & protein design, materials science, and genomics.
Abstract: Recent advances in artificial intelligence (AI), especially large language models, have accelerated the integration of multimodal data in scientific research. Because scientific fields involve diverse data types, ranging from text and images to complex biological sequences and structures, multimodal large language models (MLLMs) have emerged as powerful tools to bridge these modalities, enabling more comprehensive data analysis and intelligent decision-making. This work, $\text{S}^3\text{-Bench}$, provides a comprehensive overview of recent advances in MLLMs, focusing on their diverse applications across science. We systematically review the progress of MLLMs in key scientific domains, including drug discovery, molecular & protein design, materials science, and genomics. The work highlights model architectures, domain-specific adaptations, benchmark datasets, and promising future directions. In addition, we benchmark open-source MLLMs on a range of critical small-molecule and protein property prediction tasks. Our work aims to serve as a valuable resource for both researchers and practitioners interested in the rapidly evolving landscape of multimodal AI for science.
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 3