Keywords: Document Summarization, Conformal Prediction, Large Language Models
TL;DR: We apply conformal prediction to provide statistical guarantees that all important information within a long text is captured by an automatically generated summary.
Abstract: Automatic summarization systems have advanced rapidly with large language models (LLMs), yet they still lack reliable guarantees on the inclusion of critical content in high-stakes domains like healthcare, law, and finance. In this work, we introduce Conformal Importance Summarization, the first framework for importance-preserving summary generation that uses conformal prediction to provide rigorous, distribution-free coverage guarantees. By calibrating thresholds on sentence-level importance scores, we enable extractive document summarization with user-specified coverage and recall rates over critical content. Our method is model-agnostic, requires only a small calibration set, and seamlessly integrates with existing black-box LLMs. Experiments on established summarization benchmarks demonstrate that Conformal Importance Summarization achieves the theoretically guaranteed information coverage rate. Our work suggests that Conformal Importance Summarization can be combined with existing techniques to achieve reliable, controllable automatic summarization, paving the way for safer deployment of AI summarization tools in critical applications. Code is available at github.com/layer6ai-labs/conformal-importance-summarization.
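The abstract describes calibrating a threshold on sentence-level importance scores so that, with user-specified probability, an extractive summary retains all critical sentences. Below is a minimal sketch of how such a split-conformal calibration step could look, assuming access to a sentence-level importance scorer and binary importance labels on a calibration set; the function names and this particular formulation are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def calibrate_threshold(cal_scores, cal_labels, alpha=0.1):
    """Split-conformal calibration of a sentence-importance threshold (sketch).

    cal_scores: list of 1-D float arrays, importance scores for each sentence
        of each calibration document (output of a hypothetical scorer).
    cal_labels: list of boolean arrays marking the truly important sentences.
    alpha: target miscoverage rate; for a new exchangeable document, every
        important sentence scores >= the returned threshold with probability
        at least 1 - alpha.
    """
    # Nonconformity score per document: the lowest importance score among its
    # important sentences. Keeping every sentence scoring >= tau captures all
    # important content exactly when tau is at most this minimum.
    mins = np.array([s[l].min() for s, l in zip(cal_scores, cal_labels)])
    n = len(mins)
    k = int(np.floor(alpha * (n + 1)))  # conformal rank for a lower quantile
    if k < 1:
        # Too few calibration documents to certify coverage: keep everything.
        return -np.inf
    return np.sort(mins)[k - 1]  # k-th smallest calibration score

def extract_summary(sentences, scores, tau):
    """Extractive summary: keep sentences whose importance clears the threshold."""
    return [sent for sent, s in zip(sentences, scores) if s >= tau]
```

Under exchangeability of calibration and test documents, the threshold returned above inherits the standard conformal guarantee: a new document's minimum important-sentence score falls below it with probability at most alpha, so the extracted summary covers all important sentences at the requested rate.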
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 9684