E-Verify: A Paradigm Shift to Scalable Embedding-based Factuality Verification

ACL ARR 2025 May Submission2489 Authors

19 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Large language models (LLMs) exhibit remarkable text-generation capabilities yet struggle with factual consistency, motivating growing interest in factuality verification. Existing factuality verification methods typically follow a Decompose-Then-Verify paradigm, which improves granularity but suffers from poor scalability and efficiency. We propose a novel Decompose-Embed-Interact paradigm that shifts factuality verification from costly text-level reasoning to efficient alignment in embedding space, mitigating the scalability bottlenecks and computational inefficiencies of prior approaches. Although the paradigm promises scalable verification, implementing it raises three practical challenges: efficient decomposition, factually faithful embedding, and accurate verification in embedding space. To address them, we introduce E-Verify, a lightweight framework with three purpose-built modules, one for each stage of the paradigm, each designed to preserve scalability and efficiency. Experiments demonstrate that E-Verify significantly improves both decomposition and verification efficiency while maintaining competitive accuracy. These results confirm that the proposed paradigm enables scalable and fine-grained factuality verification with minimal performance trade-offs.
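
To make the three stages of a Decompose-Embed-Interact pipeline concrete, the following is a minimal toy sketch and not the E-Verify implementation, whose modules are not described in the abstract. Every name here (`decompose`, `embed`, `cosine`, `verify`, and the `threshold` parameter) is hypothetical. Sentence splitting stands in for claim decomposition, and a normalized bag-of-words vector stands in for a learned, factually faithful embedding.

```python
import math
import re
from collections import Counter


def decompose(text):
    """Stage 1 (assumed stand-in): split text into sentence-level claims."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def embed(text):
    """Stage 2 (assumed stand-in): L2-normalized bag-of-words vector.

    A real system would use a learned encoder; this toy vector only
    captures word overlap.
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def cosine(u, v):
    """Dot product of two already-normalized sparse vectors."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def verify(response, evidence, threshold=0.5):
    """Stage 3 (assumed stand-in): score each claim against the evidence.

    Each claim gets the maximum cosine similarity over all evidence
    sentences and is marked supported if the score reaches `threshold`.
    """
    evidence_vecs = [embed(e) for e in decompose(evidence)]
    results = []
    for claim in decompose(response):
        claim_vec = embed(claim)
        score = max((cosine(claim_vec, e) for e in evidence_vecs), default=0.0)
        results.append((claim, score, score >= threshold))
    return results


if __name__ == "__main__":
    evidence = "Paris is the capital of France. The Eiffel Tower is in Paris."
    response = "Paris is the capital of France. Bananas grow on Mars."
    for claim, score, supported in verify(response, evidence):
        print(f"{supported!s:5}  {score:.2f}  {claim}")
```

The point of the sketch is only the shape of the computation: once claims and evidence are embedded, verification reduces to cheap vector operations rather than a text-level reasoning call per claim. A purely lexical embedding like this one cannot distinguish a claim from its negation, which is one reason a factually faithful embedding is listed as a challenge.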

Paper Type: Long

Research Area: Efficient/Low-Resource Methods for NLP

Research Area Keywords: Efficient/Low-Resource Methods for NLP, Generation, NLP Applications, Semantics: Lexical and Sentence-Level

Contribution Types: Approaches to low-resource settings, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models

Languages Studied: English

Submission Number: 2489
