Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

ACL ARR 2026 January Submission1146 Authors

28 Dec 2025 (modified: 20 Mar 2026)ACL ARR 2026 January SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Claim Verification, Reinforcement Learning

Abstract: Claim verification with large language models (LLMs) has recently attracted growing attention, due to their strong reasoning capabilities and transparent verification processes compared to traditional answer-only judgments. However, existing approaches to online claim verification, which requires iterative evidence retrieval and reasoning, still mainly rely on prompt engineering or pre-designed reasoning workflows, without unified training to improve necessary skills. Therefore, we introduce Veri-R1, an online reinforcement learning (RL) framework that enables an LLM to interact with a search engine and to receive reward signals that explicitly shape its planning, retrieval, and reasoning behaviors. The dynamic interaction between models and retrieval systems more accurately reflects real-world verification scenarios and fosters comprehensive verification skills. Empirical results show that Veri-R1 improves joint accuracy by up to 30% and doubles evidence score, often surpassing its larger-scale model counterparts. Ablation studies further reveal the impact of reward components, and the link between output logits and label accuracy. Our results highlight the effectiveness of online RL for precise and faithful claim verification, and provide a foundation for future research.

Paper Type: Long

Research Area: Information Extraction and Retrieval

Research Area Keywords: Computational Social Science and Cultural Analytics, Information Extraction, Information Retrieval and Text Mining, NLP Applications, Question Answering

Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models

Languages Studied: English

Submission Number: 1146

Loading