TRIDENT: Risk-Controlled Min-Cost Facet Cover for Efficient and Faithful RAG

ACL ARR 2026 January Submission 3291 Authors

04 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Retrieval-Augmented Generation, Large Language Model

Abstract: Budgeted RAG is a decision problem: under a hard evidence cap, the passages you keep determine both accuracy and what you can credibly claim about the evidence at query time. We introduce TRIDENT, a framework that mines auditable reasoning facets, tests facet support with a calibrated verifier, and selects evidence under an explicit token budget. In the Safe-Cover regime, we freeze the retrieval pipeline into a replayable episode, map verifier scores to selection-conditional conformal p-values under a logged contract, and apply per-query multiple-testing control to yield facet-support certificates, or else return a machine-checkable abstention with a reason code. In the Pareto-Knapsack regime, we drop per-query guarantees and optimize a quality-cost frontier for throughput. On HotpotQA at a 500-token evidence cap, TRIDENT Pareto-500 improves EM/F1 from 30.81/39.61 to 45.30/58.22 (+47% relative on both metrics), while using 3% fewer evidence tokens and 5% lower latency than naive top-k truncation. These results show that under tight budgets, selection rigor and query-time evidence accountability matter as much as retrieval strength.
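
The Safe-Cover steps named in the abstract (verifier scores mapped to conformal p-values, per-query multiple-testing control, selection under a token budget, abstention with a reason code) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain split-conformal p-value rather than the selection-conditional variant, a Bonferroni correction as the multiple-testing rule, and a greedy coverage-per-token heuristic as a stand-in for the min-cost cover solver. The names `conformal_p_value`, `select_evidence`, and the reason code `BUDGET_OR_SUPPORT` are hypothetical.

```python
from bisect import bisect_left


def conformal_p_value(score, null_scores_sorted):
    """Split-conformal p-value: rank of `score` among calibration scores
    from non-supporting (null) passage-facet pairs. Assumption: higher
    verifier score means stronger evidence of support."""
    n = len(null_scores_sorted)
    n_ge = n - bisect_left(null_scores_sorted, score)  # nulls >= score
    return (1 + n_ge) / (n + 1)


def select_evidence(scores, tokens, null_scores, alpha=0.1, budget=500):
    """scores[p][f] is the verifier score for passage p on facet f;
    tokens[p] is the (positive) token cost of passage p."""
    null_sorted = sorted(null_scores)
    facets = {f for row in scores.values() for f in row}
    n_tests = sum(len(row) for row in scores.values())
    # Bonferroni (an assumed choice): each test runs at level alpha / n_tests.
    threshold = alpha / n_tests if n_tests else 0.0
    covers = {
        p: {f for f, s in row.items()
            if conformal_p_value(s, null_sorted) <= threshold}
        for p, row in scores.items()
    }

    chosen, covered, used = [], set(), 0
    while covered != facets:
        best, best_ratio = None, 0.0
        for p, supported in covers.items():
            gain = len(supported - covered)
            if p in chosen or gain == 0 or used + tokens[p] > budget:
                continue
            ratio = gain / tokens[p]  # newly covered facets per token
            if ratio > best_ratio:
                best, best_ratio = p, ratio
        if best is None:
            # No affordable passage certifies a remaining facet: abstain.
            return {"status": "abstain", "reason": "BUDGET_OR_SUPPORT",
                    "uncovered": sorted(facets - covered),
                    "passages": chosen, "tokens": used}
        chosen.append(best)
        covered |= covers[best]
        used += tokens[best]
    return {"status": "certified", "uncovered": [],
            "passages": chosen, "tokens": used}
```

A call such as `select_evidence(scores, tokens, null_scores, alpha=0.1, budget=500)` returns either a certified passage set within the budget or an abstention listing the facets that could not be certified. Greedy coverage-per-token is only a heuristic for min-cost cover; an exact solver could be swapped in without changing the testing step.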

Paper Type: Long

Research Area: Retrieval-Augmented Language Models

Research Area Keywords: Language Modeling, Information Retrieval and Text Mining

Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency

Languages Studied: English

Submission Number: 3291
