Selective Risk Certification for LLM Outputs via Information-Lift Statistics: PAC-Bayes, Robustness, and Skeleton Design
Keywords: LLM risk
Abstract: Large language models frequently generate confident but incorrect outputs, motivating formal uncertainty quantification with abstention guarantees. We develop information-lift certificates that compare model probabilities to a skeleton baseline, accumulating evidence into sub-gamma PAC-Bayes bounds valid under heavy-tailed distributions. Across eight datasets, our method achieves $77.2$\% coverage at $2$\% risk, outperforming recent 2023-2024 baselines by $8.6$-$15.1$ percentage points, while blocking $96$\% of critical errors in high-stakes scenarios versus $18$-$31$\% for entropy methods. Limitations include skeleton dependence and frequency-only (not severity-aware) risk control, though performance degrades gracefully under corruption.
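The abstract's core mechanism, comparing model probabilities to a skeleton baseline and accumulating the evidence before deciding to answer or abstain, can be sketched minimally as follows. This is an illustrative sketch only, not the paper's implementation: the function names (`information_lift`, `certify`), the per-token probability pairs, and the fixed threshold are all hypothetical stand-ins, and the sketch omits the sub-gamma PAC-Bayes calibration of the threshold described in the paper.

```python
import math

def information_lift(p_model: float, p_skeleton: float) -> float:
    """Log-ratio of the model's probability to the skeleton baseline's
    probability for the same token (hypothetical formulation)."""
    return math.log(p_model) - math.log(p_skeleton)

def certify(token_probs: list[tuple[float, float]], threshold: float = 1.0) -> str:
    """Accumulate per-token lifts over an output; emit the answer only if
    the total evidence clears the (here, fixed) threshold, else abstain.
    In the paper the threshold would come from a PAC-Bayes bound."""
    total = sum(information_lift(pm, ps) for pm, ps in token_probs)
    return "answer" if total >= threshold else "abstain"

# Hypothetical (model, skeleton) probability pairs for three tokens.
probs = [(0.9, 0.3), (0.8, 0.4), (0.7, 0.5)]
print(certify(probs))  # total lift ~2.13 >= 1.0 -> "answer"
```

The design intent, as described in the abstract, is that outputs where the model barely improves on the skeleton baseline (low accumulated lift) are abstained on, which is how coverage is traded for risk.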
Primary Area: causal reasoning
Submission Number: 20299