SGEU: enhancing LLM reasoning via backward exemplar generation and verification

Published: 2025 · Last Modified: 07 Jan 2026 · Appl. Intell. 2025 · CC BY-SA 4.0
Abstract: In-context learning has emerged as an effective method to enhance the reasoning capabilities of Large Language Models (LLMs) across diverse applications, eliciting chain-of-thought abilities through reasoning exemplars with rationales. Nevertheless, methods relying on limited, expert-annotated exemplars hinder the continual adaptation of LLMs to emerging challenges. Automatic exemplar construction methods based on forward reasoning may yield flawed rationales, particularly for tasks in which LLMs underperform. Additionally, few-shot CoT exhibits sensitivity to exemplar selection, especially to flawed exemplars. To automatically generate high-quality rationale exemplars for reasoning enhancement, we propose a self-enhancement framework comprising the Self-Generation, Evaluation, and Utilization (SGEU) of exemplars from historical data, enabling LLMs to acquire high-quality rationales for self-enhancement. Specifically, in the exemplar collection stage, given historical data, LLMs first perform backward reasoning rather than forward reasoning, conditioning on the known answer to avoid error-prone answer derivation and generate more accurate rationales, resulting in an initial sample-rationale dataset. Subsequently, we propose a self-verification module and a mutual verification module to filter high-quality exemplars for the final sample-rationale collection. In the testing stage, SGEU employs a text embedding model to retrieve similar reasoning exemplars for in-context learning, enabling precise reasoning. Experiments across four complex tasks demonstrate the superior performance of SGEU, which outperforms competitive methods by 4.8, 6.9, 2.1, and 0.7 points on legal judgment prediction, social bias prediction, arithmetic reasoning, and commonsense reasoning, respectively. Furthermore, human evaluation shows that backward reasoning outperforms forward reasoning in rationale generation, and that the verification methods effectively filter high-quality exemplars.
The code will be made publicly available (https://drive.google.com/file/d/1GcFINcMLpuj-3RxzGst4dFmxX5XCPyrF/view?usp=sharing).
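The three stages the abstract describes (backward rationale generation, self/mutual verification, and embedding-based exemplar retrieval) could be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the `llm` callable, the prompt wording, and the toy character-trigram embedding are all stand-ins for the actual model, prompts, and text embedding model.

```python
import math
from collections import Counter


def embed(text):
    # Toy text embedding: L2-normalized character-trigram counts,
    # standing in for the text embedding model used at test time.
    grams = Counter(text[i:i + 3] for i in range(len(text) - 2))
    norm = math.sqrt(sum(v * v for v in grams.values())) or 1.0
    return {g: v / norm for g, v in grams.items()}


def cosine(a, b):
    return sum(v * b.get(g, 0.0) for g, v in a.items())


def backward_rationale(llm, question, answer):
    # Backward reasoning: the known answer is placed in the prompt,
    # so the model explains it rather than deriving it from scratch.
    return llm(f"Q: {question}\nA: {answer}\nExplain step by step why.")


def self_verify(llm, question, rationale, answer):
    # Self-verification (illustrative): re-answer from the rationale and
    # keep the exemplar only if the gold answer is reproduced.
    return answer in llm(f"Q: {question}\n{rationale}\nSo the answer is:")


def mutual_verify(llm, question, rationale_a, rationale_b):
    # Mutual verification (illustrative): two independently generated
    # rationales must lead to the same final answer.
    a1 = llm(f"Q: {question}\n{rationale_a}\nSo the answer is:")
    a2 = llm(f"Q: {question}\n{rationale_b}\nSo the answer is:")
    return a1 == a2


def retrieve(pool, query, k=2):
    # Testing stage: pick the k exemplars most similar to the query
    # for use as in-context learning demonstrations.
    q = embed(query)
    return sorted(pool, key=lambda ex: -cosine(q, embed(ex["question"])))[:k]
```

With a real LLM backend, verified (question, rationale, answer) triples would accumulate into the final exemplar pool that `retrieve` draws from at test time.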