Cumulative Reasoning with Large Language Models

Published: 26 Jul 2025, Last Modified: 26 Jul 2025. Accepted by TMLR. License: CC BY 4.0
Abstract: Recent advancements in large language models (LLMs) have shown remarkable progress, yet their ability to solve complex problems remains limited. In this work, we introduce Cumulative Reasoning (CR), a structured framework that enhances LLM problem-solving by emulating human-like iterative and cumulative thought processes. CR orchestrates LLMs in three distinct roles---Proposer, Verifier(s), and Reporter---to systematically decompose tasks, generate and validate intermediate reasoning steps, and compose them into a solution by building a dynamic Directed Acyclic Graph (DAG) of verified propositions. This approach substantially enhances problem-solving capabilities. We demonstrate CR's advantage on several complex reasoning tasks: it outperforms existing methods on logical inference tasks by up to 9.3%, achieving 98.04% accuracy on the curated FOLIO wiki dataset. In the Game of 24, it achieves 98% accuracy, a 24% improvement over previous methods. On MATH problems, CR achieves a 4.2% improvement over previous methods and a 43% relative improvement on the most challenging level 5 problems. When incorporating a code environment with CR, we further harness LLMs' reasoning capabilities and outperform the Program of Thought (PoT) method by 38.8%. The code is available at https://github.com/iiis-ai/cumulative-reasoning.
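To make the Proposer/Verifier/Reporter cycle described in the abstract concrete, the following is a minimal, hypothetical Python sketch of one CR iteration loop. The role functions (`proposer`, `verifier`, `reporter`) are placeholders standing in for LLM calls and are not the authors' API; the actual implementation is in the linked repository.

```python
# Minimal sketch of the Cumulative Reasoning (CR) loop, assuming hypothetical
# role functions that wrap LLM calls. This is illustrative only; see
# https://github.com/iiis-ai/cumulative-reasoning for the authors' code.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Proposition:
    """A verified intermediate reasoning step; `premises` are its DAG parents."""
    text: str
    premises: list["Proposition"] = field(default_factory=list)


def proposer(question: str, verified: list[Proposition]) -> Proposition:
    """Propose a new intermediate step from the question and verified steps (LLM call)."""
    raise NotImplementedError("replace with an LLM call")


def verifier(question: str, candidate: Proposition) -> bool:
    """Check whether the candidate follows from its premises (LLM and/or symbolic check)."""
    raise NotImplementedError("replace with an LLM call or symbolic checker")


def reporter(question: str, verified: list[Proposition]) -> Optional[str]:
    """Return a final answer if the verified propositions suffice, else None (LLM call)."""
    raise NotImplementedError("replace with an LLM call")


def cumulative_reasoning(question: str, max_steps: int = 16) -> Optional[str]:
    verified: list[Proposition] = []        # nodes of the growing DAG
    for _ in range(max_steps):
        candidate = proposer(question, verified)
        if verifier(question, candidate):   # only validated steps enter the DAG
            verified.append(candidate)
        answer = reporter(question, verified)
        if answer is not None:              # Reporter judges the DAG sufficient
            return answer
    return None                             # no answer within the step budget
```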
Submission Length: Regular submission (no more than 12 pages of main content)
Code: https://github.com/iiis-ai/cumulative-reasoning
Supplementary Material: zip
Assigned Action Editor: ~Alessandro_Sordoni1
Submission Number: 4479