LUQ: Language Models Uncertainty Quantification Toolkit

Published: 09 Jun 2025, Last Modified: 14 Jul 2025 · CODEML@ICML25 · CC BY 4.0
Keywords: LLMs, Trustworthy AI, Uncertainty Quantification, Software
TL;DR: A comprehensive toolkit for uncertainty quantification in LLMs, featuring methods, datasets, and evaluation techniques.
Abstract: Uncertainty quantification is a principled approach to ensuring the robustness, reliability, and safety of large language models (LLMs). However, progress in this field is hindered by the lack of a unified framework for benchmarking these methods. Additionally, creating suitable datasets for uncertainty quantification is computationally demanding because it often requires sampling an LLM multiple times for each input. In this work, we propose and describe a software framework that (i) unifies the benchmarking of uncertainty quantification methods for language models, and (ii) provides an easy-to-use tool for practitioners aiming to develop more robust and safer LLM applications.
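To make the cost mentioned in the abstract concrete: many sampling-based uncertainty quantification methods draw several answers per prompt and measure how much they disagree. The sketch below is an illustrative baseline (empirical predictive entropy over distinct sampled answers), not the LUQ API; the function name and the hard-coded sample lists are assumptions for demonstration only.

```python
from collections import Counter
import math

def predictive_entropy(answers):
    """Empirical entropy over distinct sampled answers.

    Higher entropy means the model's samples disagree more,
    i.e. higher estimated uncertainty for this prompt.
    NOTE: illustrative baseline, not part of the LUQ toolkit.
    """
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Five identical samples: the model is consistent, entropy is zero.
print(predictive_entropy(["Paris"] * 5))  # 0.0

# Disagreeing samples yield positive entropy (more uncertainty).
print(predictive_entropy(["Paris", "Lyon", "Paris", "Nice", "Paris"]))
```

Because each prompt needs several generations (five in this toy example), building a benchmark dataset multiplies inference cost accordingly, which is the overhead the framework aims to amortize by shipping precomputed samples.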
Submission Number: 47