Abstract: Large language models (LLMs) excel at many reasoning tasks but continue to face significant challenges, including a lack of robustness in reasoning, difficulty with cross-task generalization, and inefficiency in scaling up reasoning capabilities. Current training paradigms, such as next-token prediction and reinforcement learning from human feedback, often fall short in adapting to diverse reasoning tasks. Existing approaches, such as prompt optimization and iterative output refinement, offer performance improvements but can be inefficient and generalize poorly. To overcome these limitations, this position paper argues for a transformative shift in how LLMs approach reasoning. Drawing inspiration from cognitive science, particularly meta-reasoning theories such as Dual-Process Theory and Metacognitive Reasoning, we propose a Bayesian meta-reasoning framework for LLMs. Our approach integrates self-awareness, monitoring, evaluation, regulation, and meta-reflection to enhance LLMs' ability to refine reasoning strategies and generalize across tasks. We revisit existing LLM reasoning methods, identify key challenges, and suggest directions for future research.
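To make the components named in the abstract concrete, the following is a minimal, illustrative Python sketch of a meta-reasoning loop over a generic LLM interface. Every name here (`propose_strategy`, `self_evaluate`, `revise_strategy`, `meta_reflect`, the confidence threshold) is a hypothetical placeholder for exposition, not the paper's implementation; see the linked repository for the authors' actual code and resources.

```python
# Illustrative sketch only: a minimal meta-reasoning loop over a generic LLM
# interface. All method names and the confidence threshold are hypothetical
# placeholders, not the framework's actual API.

def meta_reason(llm, task, max_rounds=3, confidence_threshold=0.8):
    """Iteratively generate, monitor, evaluate, and regulate a reasoning trace."""
    strategy = llm.propose_strategy(task)               # self-awareness: choose an initial strategy
    trace = None
    for _ in range(max_rounds):
        trace = llm.generate(task, strategy=strategy)    # object-level reasoning step
        confidence = llm.self_evaluate(task, trace)      # monitoring/evaluation: score the trace
        if confidence >= confidence_threshold:
            break                                        # regulation: stop when confident enough
        strategy = llm.revise_strategy(task, trace)      # regulation: switch or refine strategy
    llm.meta_reflect(task, trace, strategy)              # meta-reflection: record lessons for reuse
    return trace
```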
Lay Summary: Large language models (LLMs) are getting better at solving different reasoning problems. But they still stumble when asked to reason clearly, adapt to new kinds of problems, or explain their thinking. Why does this happen, and how can we fix it? In this work, we take inspiration from how humans reason — especially from psychology theories about how people monitor and adjust their own thinking. We argue that LLMs should be trained not just to give answers, but also to reflect on how they’re reasoning, much like a person might double-check their logic or change strategies when something feels off. We introduce a new framework that helps LLMs become more self-aware and adaptive by borrowing ideas from cognitive science, such as the “dual-process theory” of fast vs. slow thinking. This framework encourages models to evaluate their own thought processes, regulate their responses, and learn how to generalize their reasoning across a wide range of tasks. Our goal is to spark a shift in how AI systems learn to reason — moving from static answering machines toward dynamic thinkers that can flexibly solve problems and explain their own reasoning. We've also shared tools and resources to help researchers explore this new approach.
Link To Code: https://github.com/hanqi-qi/LLM_MetaReasoning
Primary Area: Research Priorities, Methodology, and Evaluation
Keywords: Large language models, reasoning, meta-reasoning
Submission Number: 199