Think Like a Human: Enhancing Large Language Models via Multi-Reward Reinforcement Learning for Fact Verification
Keywords: fact-checking, human thinking alignment, multi-reward reinforcement learning
Abstract: Fact verification aims to assess the truthfulness of claims and statements. Recent LLM-based fact verification methods rely on prompt tuning or supervised fine-tuning to achieve this goal. However, fact-checking requires models to comprehend and analyze the complex relationships between claims and evidence before rendering veracity judgments, and LLMs lack such reasoning capabilities to some extent, which leads to suboptimal performance on fact-checking tasks. In this paper, we propose a novel Thought-enhanced Fact Verification framework via Reinforcement Learning (TFV-RL) that aligns LLMs with fact-checkers' thinking process through reinforcement learning. We design a novel Multi-Reward mechanism (Multi-R) comprising four rewards that integrate fact-checking objectives into the reinforcement learning process and better align LLMs with the way fact-checkers reason. Experimental results on three datasets demonstrate that our method achieves the best performance. We further analyze the impact of different backbones and training methods, finding that TFV-RL aligns LLMs with fact-checkers' thinking process more closely: the model simulates a fact-checker's thought process during verification, yielding more accurate judgments and reasoning.
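The abstract describes Multi-R only at a high level and does not specify its four rewards. As a hedged illustration of the general idea, the Python sketch below shows one plausible way to combine several scalar reward components into a single RL training signal; the component names (verdict, reasoning, evidence, format) and their weights are hypothetical placeholders, not the paper's actual reward definitions.

```python
# Minimal sketch: weighted aggregation of multiple scalar rewards
# into one signal for a policy-gradient update. All reward names
# and weights are illustrative assumptions, not the paper's design.
from dataclasses import dataclass

@dataclass
class MultiReward:
    """Combine several scalar reward components into one RL reward."""
    weights: dict  # component name -> weight

    def __call__(self, scores: dict) -> float:
        # Weighted sum over the declared components.
        return sum(self.weights[name] * scores[name] for name in self.weights)

# Hypothetical components: label correctness, reasoning quality,
# evidence grounding, and output-format adherence.
multi_r = MultiReward(weights={
    "verdict": 1.0,    # did the model predict the correct veracity label?
    "reasoning": 0.5,  # does the rationale resemble a checker-like chain?
    "evidence": 0.5,   # is the rationale grounded in the given evidence?
    "format": 0.1,     # does the output follow the expected schema?
})

scores = {"verdict": 1.0, "reasoning": 0.8, "evidence": 0.6, "format": 1.0}
reward = multi_r(scores)  # scalar reward fed to the policy-gradient step
print(reward)  # ~1.8 (weighted sum, up to float rounding)
```

In practice, such a scalar would be plugged into whatever RL fine-tuning objective the framework uses; how the four rewards are actually computed and weighted is detailed in the paper itself, not here.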
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: fact checking
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3160