Think Like a Human: Enhancing Large Language Models via Multi-Reward Reinforcement Learning for Fact Verification
Keywords: fact-checking, human thinking alignment, multi-reward reinforcement learning
Abstract: Fact verification aims to assess the truthfulness of claims and statements. Recent LLM-based fact verification methods rely on prompt tuning or supervised fine-tuning to achieve this goal. However, fact-checking requires models to comprehend and analyze the complex relationships between claims and evidence before rendering veracity judgments, and LLMs lack such reasoning capabilities to some extent, which leads to suboptimal performance on fact-checking tasks. In this paper, we propose a novel Thought-enhanced Fact Verification framework via Reinforcement Learning (TFV-RL) that aligns LLMs with fact-checkers' thinking process through reinforcement learning. We design a novel Multi-Reward mechanism (Multi-R) comprising four rewards that integrate fact-checking objectives into the reinforcement learning process and better align LLMs with the way fact-checkers reason. Experimental results on three datasets demonstrate that our method achieves the best performance. We further analyze the impact of different backbones and training methods, finding that TFV-RL aligns LLMs with fact-checkers' thinking process more closely: the model simulates a fact-checker's thought process during verification, yielding more accurate judgments and reasoning.
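The abstract describes Multi-R only at a high level and does not specify its four rewards. As a hedged illustration of the general idea, the Python sketch below shows one plausible way to combine several scalar reward components into a single RL training signal; the component names (verdict, reasoning, evidence, format) and their weights are hypothetical placeholders, not the paper's actual reward definitions.

```python
# Minimal sketch: weighted aggregation of multiple scalar rewards
# into one signal for a policy-gradient update. All reward names
# and weights are illustrative assumptions, not the paper's design.
from dataclasses import dataclass

@dataclass
class MultiReward:
    """Combine several scalar reward components into one RL reward."""
    weights: dict  # component name -> weight

    def __call__(self, scores: dict) -> float:
        # Weighted sum over the declared components.
        return sum(self.weights[name] * scores[name] for name in self.weights)

# Hypothetical components: label correctness, reasoning quality,
# evidence grounding, and output-format adherence.
multi_r = MultiReward(weights={
    "verdict": 1.0,    # did the model predict the correct veracity label?
    "reasoning": 0.5,  # does the rationale resemble a checker-like chain?
    "evidence": 0.5,   # is the rationale grounded in the given evidence?
    "format": 0.1,     # does the output follow the expected schema?
})

scores = {"verdict": 1.0, "reasoning": 0.8, "evidence": 0.6, "format": 1.0}
reward = multi_r(scores)  # scalar reward fed to the policy-gradient step
print(reward)  # ~1.8 (weighted sum, up to float rounding)
```

In practice, such a scalar would be plugged into whatever RL fine-tuning objective the framework uses; how the four rewards are actually computed and weighted is detailed in the paper itself, not here.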
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: fact checking
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3160