MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Published: 18 Sept 2025, Last Modified: 30 Oct 2025
Venue: NeurIPS 2025 Datasets and Benchmarks Track poster
License: CC BY 4.0
Keywords: native multimodal, biomedical assistant, instruction tuning
TL;DR: We present MedMax, a large-scale multimodal biomedical instruction-tuning dataset for mixed-modal foundation models, and show that a model fine-tuned on our data outperforms GPT-4o on diverse biomedical tasks.
Abstract: Recent advancements in mixed-modal generative models have opened new avenues for developing unified biomedical assistants capable of analyzing biomedical images, answering complex questions about them, and generating multimodal patient reports. However, existing datasets face challenges such as small sizes, limited coverage of biomedical tasks and domains, and a reliance on narrow sources. To address these gaps, we present MedMax, a large-scale multimodal biomedical instruction-tuning dataset for mixed-modal foundation models. With 1.47 million instances, MedMax encompasses a diverse range of tasks, including interleaved image-text generation, biomedical image captioning and generation, visual chat, and report understanding. These tasks span knowledge across diverse biomedical domains, including radiology and histopathology, grounded in medical papers and YouTube videos. Subsequently, we fine-tune a mixed-modal foundation model on the MedMax dataset, achieving significant performance improvements: a 26% gain over the Chameleon model and an 18.3% improvement over GPT-4o across 12 downstream biomedical visual question-answering tasks. Finally, we introduce a unified evaluation suite for biomedical tasks to guide the development of mixed-modal biomedical AI assistants. We release the code, data, and model at https://mint-medmax.github.io/.
Croissant File: json
Dataset URL: https://huggingface.co/datasets/mint-medmax/medmax_data
Code URL: https://github.com/Hritikbansal/medmax
Primary Area: AI/ML Datasets & Benchmarks for health sciences (e.g. climate, health, life sciences, physics, social sciences)
Flagged For Ethics Review: true
Submission Number: 616