Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner

ACL ARR 2026 January Submission 2996 Authors

04 Jan 2026 (modified: 20 Mar 2026) · License: CC BY 4.0
Keywords: chart understanding and reasoning, vision language models
Abstract: Chart reasoning presents unique challenges due to its inherent complexity—requiring precise numerical comprehension, multi-level visual understanding, and logical inference across interconnected data elements. Existing vision-language models often struggle with such reasoning tasks, particularly when handling multi-subchart scenarios and numerical sensitivity. To address these challenges, we introduce \textbf{Chart-R1}, a chart-domain vision-language model that leverages reinforcement fine-tuning for advanced chart reasoning. We first propose a programmatic data synthesis approach to generate high-quality step-by-step reasoning data with verifiable answer formats, covering diverse chart types and complexity levels. Our two-stage training strategy includes: (1) \textbf{Chart-COT}, which decomposes complex reasoning into interpretable subtasks through chain-of-thought supervision, and (2) \textbf{Chart-RFT}, which employs group relative policy optimization with numerically sensitive rewards tailored for chart-specific reasoning. Experiments on open-source benchmarks and our proposed \textbf{ChartRQA} dataset demonstrate that Chart-R1 significantly outperforms existing chart-domain methods and rivals large-scale open/closed-source models.
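The abstract's "numerically sensitive rewards" for Chart-RFT can be illustrated with a small sketch. The function below is a hypothetical reward shape, not the paper's exact formulation: it grants full credit within a relative-error tolerance, decays credit as the error grows, and falls back to exact string matching for non-numeric answers. The name `numeric_reward` and the tolerance value are assumptions for illustration.

```python
# Hypothetical sketch of a numerically sensitive reward for chart QA;
# NOT the paper's exact formulation. Assumes verifiable answer strings.
def numeric_reward(pred: str, gold: str, rel_tol: float = 0.05) -> float:
    """Return 1.0 for a near match, partial credit that decays with
    relative error, and 0.0 when the prediction is far off."""
    try:
        p, g = float(pred), float(gold)
    except ValueError:
        # Non-numeric answers (e.g. a category label): exact match only.
        return 1.0 if pred.strip().lower() == gold.strip().lower() else 0.0
    if g == 0.0:
        return 1.0 if p == 0.0 else 0.0
    rel_err = abs(p - g) / abs(g)
    if rel_err <= rel_tol:
        return 1.0
    # Smoothly reduce credit as relative error grows beyond the tolerance.
    return max(0.0, 1.0 - rel_err)
```

In a GRPO-style setup, a graded reward like this gives the policy a denser learning signal on numeric chart questions than a binary exact-match check would.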
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Multimodality and Language Grounding to Vision, Robotics and Beyond, Question Answering
Contribution Types: Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English
Submission Number: 2996