TRUST Agents 2.0: Comparing Two Agentic Fact-Checking Pipelines for Explainable and Uncertainty-Aware Verification

Satya Subrahmanya Gautama Shastry Bulusu Venkata; Santhosh Kakarla; Aishwarya Gaddam; Maheedhar Sai Omtri Mohan


Published: 28 Apr 2026, Last Modified: 28 Apr 2026, MSLD 2026 Poster, CC BY 4.0

Keywords: Natural Language Processing

Abstract: Automated fact-checking is commonly evaluated as a label prediction problem, but practical systems must solve a richer sequence of tasks: identify verifiable claims, retrieve relevant evidence, reason over partial or conflicting support, and produce explanations that humans can inspect. In this paper, we design and compare two agentic fact-checking pipelines developed within the same TRUST Agents project framework. The first is a baseline four-agent pipeline (claim extraction, evidence retrieval, verification, and explanation). The second, TRUST Agents 2.0, adds three modules for compound-claim reasoning: a LoCal-inspired decomposer that generates atomic sub-claims and logical structure, a Delphi-style multi-agent jury that performs trust-weighted deliberative verification, and a logic aggregator that recomposes atomic verdicts into a final claim-level decision. Both pipelines combine tool-augmented LLM agents with hybrid retrieval (BM25 + FAISS), and are evaluated on LIAR against fine-tuned BERT, fine-tuned RoBERTa, and a zero-shot GPT baseline. The results show that fine-tuned discriminative baselines remain stronger on raw benchmark accuracy, while the advanced agentic pipeline improves over our baseline agentic pipeline under optimistic uncertainty mapping and provides substantially richer traceability, explicit intermediate reasoning artifacts, and abstention behavior. Our central finding is that agentic design materially improves what a fact-checking system can expose, diagnose, and control, even when benchmark metrics remain constrained by uncertainty rates, retrieval coverage, and binary evaluation protocols.
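
The jury and aggregator steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the label set, the voting rule, or the logical operators, so everything here is an assumption made for illustration: the three labels, the `Vote` type, the `margin` abstention threshold, and the use of three-valued (Kleene) AND/OR to recompose atomic verdicts. It is not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical label set; the abstract does not name the labels used.
TRUE, FALSE, UNCERTAIN = "true", "false", "uncertain"


@dataclass
class Vote:
    verdict: str  # one of TRUE, FALSE, UNCERTAIN
    trust: float  # non-negative juror weight (assumed to be given)


def jury_verdict(votes, margin=0.2):
    """Trust-weighted vote over one atomic sub-claim.

    Abstains (returns UNCERTAIN) when the weighted gap between
    supporting and refuting jurors is smaller than `margin`.
    """
    total = sum(v.trust for v in votes)
    if total == 0:
        return UNCERTAIN
    support = sum(v.trust for v in votes if v.verdict == TRUE) / total
    refute = sum(v.trust for v in votes if v.verdict == FALSE) / total
    if abs(support - refute) < margin:
        return UNCERTAIN
    return TRUE if support > refute else FALSE


def aggregate(verdicts, op="and"):
    """Recompose atomic verdicts with three-valued (Kleene) logic."""
    if op == "and":
        if FALSE in verdicts:
            return FALSE
        return UNCERTAIN if UNCERTAIN in verdicts else TRUE
    if op == "or":
        if TRUE in verdicts:
            return TRUE
        return UNCERTAIN if UNCERTAIN in verdicts else FALSE
    raise ValueError(f"unsupported operator: {op}")


# Usage: two sub-claims joined by AND.
first = jury_verdict([Vote(TRUE, 0.9), Vote(FALSE, 0.3), Vote(UNCERTAIN, 0.3)])
second = jury_verdict([Vote(TRUE, 0.5), Vote(FALSE, 0.5)])
final = aggregate([first, second], op="and")
```

Under these assumptions a single uncertain sub-claim makes a conjunction uncertain unless another sub-claim is refuted, which is one simple way abstention can propagate to the claim-level decision.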

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 72
