Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner

ACL ARR 2026 January Submission 2996 Authors

04 Jan 2026 (modified: 20 Mar 2026) · License: CC BY 4.0
Keywords: chart understanding and reasoning, vision language models
Abstract: Chart reasoning presents unique challenges due to its inherent complexity—requiring precise numerical comprehension, multi-level visual understanding, and logical inference across interconnected data elements. Existing vision-language models often struggle with such reasoning tasks, particularly when handling multi-subchart scenarios and numerical sensitivity. To address these challenges, we introduce \textbf{Chart-R1}, a chart-domain vision-language model that leverages reinforcement fine-tuning for advanced chart reasoning. We first propose a programmatic data synthesis approach to generate high-quality step-by-step reasoning data with verifiable answer formats, covering diverse chart types and complexity levels. Our two-stage training strategy includes: (1) \textbf{Chart-COT}, which decomposes complex reasoning into interpretable subtasks through chain-of-thought supervision, and (2) \textbf{Chart-RFT}, which employs group relative policy optimization with numerically sensitive rewards tailored for chart-specific reasoning. Experiments on open-source benchmarks and our proposed \textbf{ChartRQA} dataset demonstrate that Chart-R1 significantly outperforms existing chart-domain methods and rivals large-scale open/closed-source models.
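The abstract's "numerically sensitive rewards" for Chart-RFT can be illustrated with a small sketch. The function below is a hypothetical reward shape, not the paper's exact formulation: it grants full credit within a relative-error tolerance, decays credit as the error grows, and falls back to exact string matching for non-numeric answers. The name `numeric_reward` and the tolerance value are assumptions for illustration.

```python
# Hypothetical sketch of a numerically sensitive reward for chart QA;
# NOT the paper's exact formulation. Assumes verifiable answer strings.
def numeric_reward(pred: str, gold: str, rel_tol: float = 0.05) -> float:
    """Return 1.0 for a near match, partial credit that decays with
    relative error, and 0.0 when the prediction is far off."""
    try:
        p, g = float(pred), float(gold)
    except ValueError:
        # Non-numeric answers (e.g. a category label): exact match only.
        return 1.0 if pred.strip().lower() == gold.strip().lower() else 0.0
    if g == 0.0:
        return 1.0 if p == 0.0 else 0.0
    rel_err = abs(p - g) / abs(g)
    if rel_err <= rel_tol:
        return 1.0
    # Smoothly reduce credit as relative error grows beyond the tolerance.
    return max(0.0, 1.0 - rel_err)
```

In a GRPO-style setup, a graded reward like this gives the policy a denser learning signal on numeric chart questions than a binary exact-match check would.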
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Multimodality and Language Grounding to Vision, Robotics and Beyond, Question Answering
Contribution Types: Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English
Submission Number: 2996