TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

Danna Zheng; Danyang Liu; Mirella Lapata; Jeff Z. Pan

Published: 04 Mar 2024, Last Modified: 14 Apr 2024. Venue: SeT LLM @ ICLR 2024. License: CC BY 4.0

Keywords: LLM, Evaluation, Trustworthiness

Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLMs' outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground-truth information. This paper introduces TrustScore, a framework based on the concept of Behavioral Consistency, which evaluates whether an LLM's response aligns with its intrinsic knowledge. Additionally, TrustScore can seamlessly integrate with fact-checking methods, which assess alignment with external knowledge sources. The experimental results show that TrustScore achieves strong correlations with human judgments, surpassing existing reference-free metrics and achieving results on par with reference-based metrics.

Submission Number: 66
