Benchmarking Multimodal Personalized Reasoning of Vision-Language Models in the Wild

Published: 26 May 2026, Last Modified: 26 May 2026
Venue: ICML 2026 FoGen Workshop Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Personalization, Reasoning, Multimodal Large Language Model
TL;DR: We introduce MPRBench, the first comprehensive benchmark for multimodal personalized reasoning, highlighting key challenges such as rejecting irrelevant personal information and recognizing personalized concepts.
Abstract: People often make decisions by holistically reasoning over heterogeneous forms of personal information. In this paper, we study the relatively underexplored capability of multimodal personalized reasoning in multimodal large language models (MLLMs). To this end, we introduce MPRBench, the first comprehensive benchmark specifically designed to evaluate personalized reasoning across a wide range of tasks and real-world challenges. MPRBench consists of 12 sub-tasks and additionally supports interpretable analysis of representative error cases. Through extensive experiments, we find that current personalized MLLMs still struggle with personalized reasoning due to several key challenges, including inaccurate recognition of personalized concepts and sensitivity to irrelevant personal information. We hope that MPRBench will stimulate further research on personalized reasoning in MLLMs.
Submission Number: 52