Keywords: Conformal prediction, large language model, conditional validity
Abstract: Large language models (LLMs) face significant challenges in providing reliable uncertainty quantification for language generation. We introduce a novel conformal prediction framework designed to enhance this reliability through Collaborative Ranking and Dynamic Thresholds. Our method departs from traditional metrics by harnessing advanced LLM capabilities for comparative judgment, ranking candidate responses to form a robust, rank-based nonconformity score. This approach enables the construction of prediction sets with rigorous statistical guarantees that inherently adapt to diverse input difficulties and prompt complexities. Extensive experiments across varied question-answering domains consistently demonstrate significant improvements in conditional coverage, delivering precisely calibrated prediction sets even for tasks demanding extended reasoning and factual accuracy. Code with implementation details is available in the repository below: https://anonymous.4open.science/r/512499.
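The abstract's recipe can be illustrated with a minimal split-conformal sketch: score each calibration example by the rank that an LLM-based comparative judgment assigns to the reference answer, calibrate a quantile of those rank scores, and include in the prediction set every candidate ranked at or above that threshold. The `rank_candidates` callable, the calibration-example fields, and the quantile rule below are illustrative assumptions, not the authors' Collaborative Ranking and Dynamic Thresholds implementation (see the linked repository for that).

```python
import math
from typing import Callable, Dict, List

def rank_nonconformity(candidates: List[str], correct: str,
                       rank_candidates: Callable[[List[str]], List[int]]) -> int:
    """Nonconformity score = rank (0 = best) that the LLM's comparative
    judgment assigns to the reference answer among the candidates.
    Assumes `correct` appears in `candidates`."""
    ordering = rank_candidates(candidates)  # candidate indices, best first
    return ordering.index(candidates.index(correct))

def calibrate_threshold(cal_examples: List[Dict], rank_candidates,
                        alpha: float = 0.1) -> int:
    """Split-conformal calibration: use the ceil((n+1)(1-alpha))-th smallest
    rank score over the calibration set as the inclusion threshold."""
    scores = sorted(
        rank_nonconformity(ex["candidates"], ex["correct"], rank_candidates)
        for ex in cal_examples
    )
    n = len(scores)
    k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    return scores[k]

def prediction_set(candidates: List[str], rank_candidates,
                   threshold: int) -> List[str]:
    """Keep every candidate whose rank is within the calibrated threshold;
    marginal coverage of at least 1 - alpha follows from exchangeability."""
    ordering = rank_candidates(candidates)
    return [candidates[i] for pos, i in enumerate(ordering) if pos <= threshold]
```

This sketch uses a single global quantile; the conditional-coverage behavior the abstract emphasizes would presumably replace it with thresholds that adapt to prompt difficulty, a refinement omitted here.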
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 16167