Keywords: Large Language Models, Conformal Factuality, Local Coverage, Model Aggregation
Abstract: With the growing generative capabilities of large language models (LLMs) in question answering, their practical deployment is hindered by unreliable outputs. Conformal methods have been introduced to control sub-claim factuality with theoretical guarantees.
In particular, Conformal Factuality (CF) offers a marginal guarantee that the overall error rate stays below $\alpha$ via a global filtering threshold. Conditional Conformal (CC) aims to improve information retention by learning localized thresholds that optimize the number of retained sub-claims over a user-specified function class. However, its local coverage is unstable due to sensitivity to the choice of function class, and its training cost is high because conformal prediction must be re-computed at every gradient update. To address these issues, we propose a lightweight framework, Localized Conformal Factuality enhanced by multi-model Aggregation (AggLCF), with rigorous marginal coverage guarantees. By semantically clustering diverse responses from multiple LLMs and extracting structured features, AggLCF learns a localized threshold that empirically achieves $1 - \alpha$ coverage per question while maximizing information retention. Requiring neither fine-tuning, a user-specified function class, nor per-update re-computation, AggLCF outperforms the previous state-of-the-art conditional conformal method, achieving both marginal and localized coverage on challenging inputs from the MedLFQA benchmark while retaining the largest number of valid sub-claims.
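As a minimal sketch of the split-conformal calibration underlying CF's marginal guarantee (all variable names and the uniform calibration scores below are illustrative assumptions, not the paper's implementation): a global filtering threshold is chosen as a finite-sample-corrected quantile of calibration nonconformity scores, so an exchangeable test point is covered with probability at least $1 - \alpha$.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold at miscoverage level alpha.

    With n calibration scores, taking the empirical quantile at
    level ceil((n + 1) * (1 - alpha)) / n yields marginal
    1 - alpha coverage for an exchangeable test score.
    """
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

# Hypothetical calibration set of sub-claim nonconformity scores.
rng = np.random.default_rng(0)
cal = rng.uniform(size=1000)
tau = conformal_threshold(cal, alpha=0.1)
# Sub-claims whose score is <= tau would be retained in the output.
```

A localized variant, as the abstract describes, would replace the single global `tau` with a per-question threshold predicted from features of the clustered multi-model responses, while preserving the marginal guarantee.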
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 10741