TrendFact: A Benchmark for Explainable Hotspot Perception in Fact-Checking with Natural Language Explanation
Abstract: Although fact verification remains fundamental, explanation generation is a critical enabler for trustworthy fact-checking systems, producing interpretable rationales and supporting comprehensive verification. However, current benchmarks have notable limitations, including the lack of impact assessment, insufficient high-quality explanatory annotations, and an English-centric bias. To address these, we introduce TrendFact, the first hotspot perception fact-checking benchmark that comprehensively evaluates fact verification, evidence retrieval, and explanation generation. TrendFact consists of 7,643 carefully curated samples sourced from trending platforms and professional fact-checking datasets, together with an evidence library of 66,217 entries with publication dates. We further propose two metrics, ECS and HCPI, which complement existing benchmarks by evaluating a system's explanation consistency and hotspot perception capability, respectively. Experimental results show that current fact-checking systems, including advanced reasoning large language models (RLMs) such as DeepSeek-R1, face significant limitations on TrendFact, highlighting the real-world challenges it poses. To enhance the fact-checking capabilities of RLMs, we propose FactISR, which integrates dynamic evidence augmentation, evidence triangulation, and an iterative self-reflection mechanism. FactISR effectively improves RLM performance, offering new insights for explainable and complex fact-checking.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, fact checking
Contribution Types: Reproduction study, Data resources
Languages Studied: Chinese
Keywords: benchmarking, fact checking
Submission Number: 3831