ThinknCheck: Advancing Claim Verification with Compact, Reasoning-Driven, and Interpretable Models

ACL ARR 2025 May Submission5318 Authors

20 May 2025 (modified: 03 Jul 2025)
Abstract: We introduce ThinknCheck, a reasoning-optimized claim verification model that explicitly generates explanation chains before making verification decisions. This Gemma3-based 1B parameter model, fine-tuned on our new LLMAggreFact-Think dataset, achieves 78.1% balanced accuracy on the LLMAggreFact benchmark, outperforming the 7B MiniCheck model (current SOTA) while requiring substantially fewer computational resources. Explicit reasoning significantly enhances verification accuracy (+20.6 points over a non-reasoning ablation) and improves out-of-domain generalization (+14.7 points on scientific claims). Qualitative analysis of reasoning traces reveals distinct patterns: surface-level evidence matching dominates current datasets, while complex synthesis in claim verification remains underrepresented. To evaluate numerical reasoning, we contribute GSMClaims, a dataset reformulating grade school math problems as verification tasks. Error analysis identifies domain-specific patterns, informing our specialized ThinknCheck-Science variant, which yields substantial performance gains across all benchmarks. Reasoning-first approaches are a promising direction for more accurate, edge-device-friendly, interpretable, and generalizable claim verification systems across diverse domains.
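The abstract describes a reason-then-verdict pattern: the model emits an explanation chain first and commits to a verification label only at the end. A minimal sketch of how such output might be prompted for and parsed is below; the prompt template, the "Verdict:" convention, and the Supported/Unsupported labels are illustrative assumptions, not the paper's actual format.

```python
import re


def build_prompt(evidence: str, claim: str) -> str:
    """Reason-then-verdict template: ask for an explanation before the decision."""
    return (
        f"Evidence: {evidence}\n"
        f"Claim: {claim}\n"
        "Explain step by step whether the evidence supports the claim, "
        "then end with 'Verdict: Supported' or 'Verdict: Unsupported'."
    )


def parse_response(text: str):
    """Split a reason-then-verdict completion into (reasoning_trace, label)."""
    match = re.search(r"Verdict:\s*(Supported|Unsupported)", text, re.IGNORECASE)
    if match is None:
        return text.strip(), None  # model never committed to a verdict
    return text[: match.start()].strip(), match.group(1).capitalize()


# A hand-written completion stands in for a model call:
completion = (
    "The evidence states the bridge opened in 1937, which matches the claim.\n"
    "Verdict: Supported"
)
reasoning, label = parse_response(completion)
```

Keeping the verdict at the end of the generation is what makes the reasoning trace usable for interpretability: the trace can be inspected independently of the final label.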
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Efficient/Low-Resource Methods for NLP, Interpretability and Analysis of Models for NLP, NLP Applications, Resources and Evaluation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 5318