ChartMaster: Boosting MLLMs for Chart Analysis through Data, Perception, and Reasoning Optimization

07 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: Chart Analysis, Multimodal Large Language Models
TL;DR: ChartMaster enhances MLLMs' chart analysis by improving their fine-grained perception and enabling efficient reasoning.
Abstract: Multimodal Large Language Models (MLLMs) have demonstrated significant potential in understanding visual information, yet they fall short in the complex domain of chart analysis. In particular, existing models struggle to capture fine-grained visual elements accurately and to perform efficient multi-step reasoning. To address these challenges, we introduce ChartMaster, a holistic framework that systematically advances chart analysis by jointly optimizing data, perception, and reasoning. Our approach rests on three core innovations. First, we construct ChartVerse, a large-scale synthetic dataset with diverse chart types, rendering styles, and reasoning levels. The second and third form a novel two-stage training paradigm built on this dataset: (i) Multi-Negative Direct Preference Optimization (MNDPO), which improves perceptual precision by training models to distinguish correct answers from carefully designed hard negative samples (i.e., plausible but incorrect alternatives); and (ii) Reinforcement Learning with Dynamic Length Reward (DLR), which adapts the length of chain-of-thought reasoning to task complexity, encouraging concise solutions for simple queries and rigorous multi-step reasoning for complex ones. Extensive experiments across six benchmarks demonstrate that ChartMaster achieves state-of-the-art performance, surpassing prior chart-domain models and rivaling proprietary systems. These results indicate that coupling a diverse data foundation with targeted perceptual and reasoning optimization provides an effective pathway toward robust chart understanding in MLLMs.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 2753