Supporting Humans in Evaluating AI Summaries of Legal Depositions

Naghmeh Farzi, Laura Dietz, David D. Lewis

Published: 2026, Last Modified: 15 Apr 2026 · CHIIR 2026 · CC BY-SA 4.0
Abstract: While large language models (LLMs) are increasingly used to summarize long documents, this trend poses significant challenges in the legal domain, where the factual accuracy of deposition summaries is crucial. Nugget-based methods have proven highly effective for the automated evaluation of summarization approaches, yet their potential to support end users directly remains underexplored. In this work, we translate these methods to the user side and explore how nuggets can directly assist end users. Focusing on the legal domain, we present a prototype that leverages a factual nugget-based approach to support legal professionals in two concrete scenarios: (1) determining which of two summaries is better, and (2) manually improving an automatically generated summary.