MMRC: Measuring Massive-Computational Math Reasoning with Code in LLMs

18 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: benchmark; math–code integrated reasoning
TL;DR: A fully human-collected and human-adapted benchmark of high-computational-load problems for math–code integrated reasoning.
Abstract: We introduce MMRC, a benchmark of 500 curated problems designed to evaluate large language models (LLMs) on their ability to integrate mathematical reasoning with code execution. Unlike existing benchmarks that emphasize either elementary or olympiad-level mathematics, MMRC focuses on university-level subjects, including calculus, discrete mathematics, linear algebra, linear programming, mathematical physics, numerical computation, and scientific computing. Each problem is deliberately adapted to impose a substantial computational workload, making text-only reasoning infeasible and requiring code execution for accurate solutions. We evaluate 120 open- and closed-source models on MMRC under two paradigms: Code-Invoked Reasoning (CIR), in which models generate Python scripts, and Math-Code Agent (MCA), in which models interact dynamically with a Python interpreter. We find that code integration consistently improves accuracy on complex tasks while reducing token usage. MMRC thus provides the first systematic benchmark for math–code integrated reasoning and a step toward LLMs that solve real-world computational problems.
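To make the CIR setting concrete, the sketch below shows how a model-generated Python script might be executed and scored against a reference answer. This is not the paper's evaluation harness: the function name run_cir_candidate, the convention that the script prints its final answer on the last stdout line, the exact-match scoring rule, and the timeout value are all illustrative assumptions.

```python
import math
import os
import subprocess
import sys
import tempfile


def run_cir_candidate(script: str, reference: str, timeout_s: int = 60) -> bool:
    """Execute a model-generated Python script (CIR-style) and check whether
    its final printed line matches the reference answer. Illustrative only."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(script)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # high-computation problems need a real time budget
        )
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.unlink(path)
    lines = proc.stdout.strip().splitlines()
    predicted = lines[-1].strip() if lines else ""
    return predicted == reference.strip()


# Toy usage (not an MMRC problem): a computation that is tedious by hand
# but trivial once delegated to the interpreter.
script = "import math\nprint(math.comb(50, 25))"
print(run_cir_candidate(script, str(math.comb(50, 25))))  # expected: True
```

The MCA paradigm described in the abstract would replace this one-shot execution with an interactive loop in which the model issues code, observes interpreter output, and continues reasoning; that loop is not sketched here.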
Primary Area: datasets and benchmarks
Submission Number: 10756