MM-InstructEval: Zero-shot evaluation of (Multimodal) Large Language Models on multimodal reasoning tasks

Published: 01 Jan 2025 · Last Modified: 21 May 2025 · Information Fusion 2025 · CC BY-SA 4.0
Abstract: Highlights
- We propose MM-InstructEval with metrics for efficacy, robustness, and adaptability.
- We evaluate 45 models on 16 datasets with 10 instructions for 6 multimodal tasks.
- We assess LLMs and MLLMs, revealing insights into performance on multimodal tasks.