Multi-Modal and Multi-Task Transformer for Small Molecule Drug Discovery

Published: 17 Jun 2024, Last Modified: 16 Jul 2024
Venue: ML4LMS Poster
License: CC BY-SA 4.0
Keywords: Drug discovery, Transformers, Multi-modal, Multi-task, Molecular property prediction
Abstract: We introduce a 1B-parameter transformer model pre-trained from scratch on 2.25T tokens drawn from a massive mixture of datasets centered on drug discovery. These datasets are heterogeneous, coming from dozens of sources and spanning 15 data modalities. We demonstrate the model's capability on a range of molecular assay prediction tasks, including public benchmarks and internally generated holdouts from real-world drug discovery programs. After parameter-efficient fine-tuning, the multi-modal transformer outperforms strong molecular property prediction baselines, including XGBoost and Chemprop, on multi-task predictions.
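The abstract mentions parameter-efficient fine-tuning but does not specify the technique. As a hedged illustration only, the sketch below shows one common approach, a LoRA-style low-rank adapter: the pre-trained weight is frozen and only a small low-rank correction is trained. All names here (`LoRALinear`, the rank and alpha values) are hypothetical and not taken from the paper.

```python
import numpy as np

class LoRALinear:
    """Illustrative LoRA-style layer: frozen base weight W plus a trainable
    low-rank update B @ A. This is a generic sketch of parameter-efficient
    fine-tuning, not the paper's stated method."""

    def __init__(self, d_in, d_out, rank=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pre-trained weight (stands in for a loaded checkpoint).
        self.W = rng.normal(0.0, 0.02, (d_out, d_in))
        # Trainable low-rank factors: A is randomly initialized,
        # B starts at zero so fine-tuning begins from the base model.
        self.A = rng.normal(0.0, 0.02, (rank, d_in))
        self.B = np.zeros((d_out, rank))
        self.scale = alpha / rank

    def __call__(self, x):
        # Base forward pass plus the scaled low-rank correction.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T
```

Because `B` is zero-initialized, the adapted layer initially reproduces the frozen model exactly; only the small `A` and `B` matrices (rank × d_in + d_out × rank parameters) are updated during fine-tuning, a tiny fraction of a 1B-parameter model.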
Supplementary Material: pdf
Submission Number: 110
