The release of OpenAI's o1 marks a significant milestone in AI, achieving proficiency comparable to PhD-level expertise in mathematics and coding. While o1 excels at complex reasoning tasks, it remains a closed-source model, limiting its accessibility and broader application in academic and industrial contexts. Despite numerous efforts to replicate o1's results, these attempts often focus on isolated aspects of the model (e.g., training, inference), neglecting the holistic interplay between components and failing to provide a global picture of the pathways to enhancing LLMs' reasoning capabilities and replicating o1's performance. Currently, there is no systematic review of these replication efforts, nor a clear survey of the major issues that must be addressed to achieve performance comparable to o1's.
In this survey, we provide a systematic review of the most up-to-date state of knowledge on reasoning LLMs, helping researchers understand the current challenges and advances in this field. We summarize recent efforts to replicate o1's performance and, more importantly, address the key obstacles to enhancing reasoning abilities. We (1) review the basic concepts and techniques behind o1 and efforts to replicate o1-like models; (2) detail efforts to construct logical and structured reasoning datasets; (3) delve into important training techniques (e.g., supervised fine-tuning, reinforcement learning, DPO) that harness these datasets to ensure the model acquires robust logical reasoning and structured problem-solving capabilities; and (4) explore different inference techniques (e.g., tree-of-thoughts, critic-based approaches, self-correction strategies) that help reasoning LLMs navigate the problem space and identify efficient problem-solving paths. We also summarize the current challenges and discuss opportunities for further improvement of reasoning LLMs.
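To make the inference-time search idea in (4) concrete, the following is a minimal, illustrative sketch of a tree-of-thoughts-style procedure: breadth-first expansion of partial solutions with score-based pruning. In a real reasoning LLM, `propose` would sample candidate next steps from the model and `score` would be a learned or prompted evaluator; both names and the toy number-reaching puzzle below are hypothetical stand-ins, not part of any surveyed system.

```python
def propose(state):
    """Generate candidate next 'thoughts' (here: toy arithmetic moves)."""
    value, path = state
    return [(value + 1, path + ["+1"]), (value * 2, path + ["*2"])]

def score(state, target):
    """Rate a partial solution; closer to the target is better."""
    value, _ = state
    return -abs(target - value)

def tree_of_thoughts(start, target, beam_width=3, max_depth=6):
    """Search for a move sequence from `start` to `target`."""
    frontier = [(start, [])]  # each state: (current value, path of moves)
    for _ in range(max_depth):
        # expand every state in the frontier into candidate continuations
        candidates = [s for state in frontier for s in propose(state)]
        for value, path in candidates:
            if value == target:  # goal check on each new partial solution
                return path
        # prune: keep only the most promising partial solutions
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        frontier = candidates[:beam_width]
    return None

print(tree_of_thoughts(3, 14))
```

The pruning step is what distinguishes this from exhaustive search: weak branches are discarded early, trading completeness for efficiency, which mirrors how critic-based approaches steer generation toward promising reasoning paths.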