Keywords: Mathematical Modeling Agent, LLM Agent, LLM
TL;DR: We introduce MM-Agent and MM-Bench to enable and evaluate LLM-powered end-to-end mathematical modeling, achieving superior performance on real-world problems and a Finalist Award in MCM/ICM 2025.
Abstract: Mathematical modeling is a cornerstone of scientific discovery and engineering practice, enabling the translation of real-world problems into formal systems across domains such as physics, biology, and economics. Unlike mathematical reasoning, which assumes a predefined formulation, modeling requires open-ended problem analysis, abstraction, and principled formalization. While Large Language Models (LLMs) have shown strong reasoning capabilities, they fall short in rigorous model construction, limiting their utility in real-world problem-solving. To bridge this gap, we formalize the task of LLM-powered real-world mathematical modeling, where agents must analyze problems, construct domain-appropriate formulations, and generate complete end-to-end solutions. We introduce MM-Bench, a curated benchmark of 111 problems from the Mathematical Contest in Modeling (MCM/ICM), spanning the years 2000 to 2025 across ten diverse domains such as physics, biology, and economics. To tackle this task, we propose MM-Agent, an expert-inspired framework that decomposes mathematical modeling into four stages: open-ended problem analysis, structured model formulation, computational problem solving, and report generation. Experiments on MM-Bench show that MM-Agent significantly outperforms baseline agents, achieving an 11.88% improvement over human expert solutions while requiring only 15 minutes and $0.88 per task using GPT-4o. Furthermore, under official MCM/ICM protocols, MM-Agent assisted two undergraduate teams in winning the Finalist Award (top 2.0% among 27,456 teams) in MCM/ICM 2025, demonstrating its practical effectiveness as a modeling copilot.
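To make the four-stage decomposition concrete, the sketch below shows one way such a pipeline could be orchestrated. This is a minimal illustration, not MM-Agent's implementation; all class and function names (ModelingArtifacts, analyze_problem, formulate_model, solve_model, write_report) are hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical sketch of a four-stage modeling pipeline: analysis -> formulation
# -> solving -> report. Stage bodies are placeholders, not MM-Agent's actual logic.

@dataclass
class ModelingArtifacts:
    analysis: str = ""
    formulation: str = ""
    results: str = ""
    report: str = ""

def analyze_problem(statement: str) -> str:
    """Stage 1: open-ended problem analysis (placeholder)."""
    return f"Variables, assumptions, and objectives extracted from: {statement}"

def formulate_model(analysis: str) -> str:
    """Stage 2: structured model formulation (placeholder)."""
    return f"Candidate formulation (e.g., optimization or ODE model) for: {analysis}"

def solve_model(formulation: str) -> str:
    """Stage 3: computational problem solving (placeholder)."""
    return f"Numerical results computed for: {formulation}"

def write_report(artifacts: ModelingArtifacts) -> str:
    """Stage 4: report generation (placeholder)."""
    return "\n\n".join([artifacts.analysis, artifacts.formulation, artifacts.results])

def run_pipeline(statement: str) -> ModelingArtifacts:
    """Chain the four stages, passing each stage's output to the next."""
    art = ModelingArtifacts()
    art.analysis = analyze_problem(statement)
    art.formulation = formulate_model(art.analysis)
    art.results = solve_model(art.formulation)
    art.report = write_report(art)
    return art

if __name__ == "__main__":
    out = run_pipeline("Schedule bike-share rebalancing under demand uncertainty.")
    print(out.report)
```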
Submission Number: 16