Abstract: Large language models (LLMs) have garnered significant attention due to their strong logical reasoning capabilities. However, there is still ample room for improvement on complex, long-horizon reasoning problems. Directly fine-tuning an LLM on domain-specific data is effective but incurs substantial financial cost. Another line of research aims to improve performance efficiently by exploiting the model's inherent capabilities without tuning any parameters: such methods either use the LLM's fact-evaluation ability for self-verification during logical reasoning, or employ voting to improve the consistency of the model's decisions. However, because the model's own knowledge is limited in specific domains, the gains from self-verification are typically modest. In this paper, we propose a compromise between these two lines: training a small verification model to evaluate the reasoning process of a large model. To overcome the error-propagation problem of traditional verification models, we further propose a contrast-enhanced verification-model training framework. Experimental results show that our effective and efficient verifier (EEV) achieves substantial performance gains on five datasets for multi-hop fact reasoning and long-horizon mathematical reasoning at small cost.
Paper Type: long
Research Area: Question Answering
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English