A Survey on Large Language Model Reasoning Failures

Published: 09 Jul 2025, Last Modified: 25 Jul 2025
Venue: AI4Math@ICML25 Poster
License: CC BY-NC-SA 4.0
Keywords: Large Language Models, Failure Cases, Cognitive Reasoning, Logical Reasoning, Embodied Reasoning
TL;DR: We present the first comprehensive survey of the important but previously overlooked field of LLM reasoning failures, and provide insights for future research.
Abstract: Large Language Models (LLMs) have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks. Despite these advances, significant reasoning failures persist, occurring even in seemingly simple scenarios. To systematically understand and address these shortcomings, we present the first comprehensive survey dedicated to reasoning failures in LLMs. We introduce a novel categorization framework that divides reasoning into embodied and non-embodied types, with the latter further subdivided into informal (intuitive) and formal (logical) reasoning. In parallel, we classify reasoning failures along a complementary axis into three types: fundamental failures intrinsic to LLM architectures that broadly affect downstream tasks; application-specific limitations that manifest in particular domains; and robustness issues characterized by inconsistent performance under minor input variations. For each category, we synthesize existing studies, analyze common failure patterns and underlying causes, and suggest mitigation strategies. By unifying fragmented research efforts, our survey provides a structured perspective on systemic weaknesses in LLM reasoning, offering valuable insights and guiding future research towards building stronger, more reliable, and robust reasoning capabilities.
Submission Number: 8