Quantifying Information Gain and Redundancy in Multi-Turn LLM Conversations

Published: 06 Oct 2025, Last Modified: 04 Nov 2025 · MTI-LLM @ NeurIPS 2025 (Oral) · CC BY-ND 4.0
Keywords: information-theoretic evaluation, multi-turn communication channels, mutual information
TL;DR: Information-theoretic metrics for evaluating and improving multi-turn LLM conversations.
Abstract: Large language models (LLMs) are increasingly used in multi-turn settings, yet we lack standardized ways to measure how much new information each turn contributes and how much of the token budget is wasted on redundancy. We propose two operational metrics: (i) Information Gain per Turn (IGT), measuring the new information (in bits) contributed by a model’s response at each turn, and (ii) Token Waste Ratio (TWR), the fraction of a response that is redundant given the conversation history. We derive an IGT–TWR coupling via data-processing arguments and define an interactive-channel capacity $C_{int}$: the per-turn upper bound for two-way, context-dependent exchange. Across four studies (controlled Q&A, cross-model comparison, decoding effects, capacity stress), results on GPT-4o, Claude-3-Sonnet, GPT-3.5-Turbo and LLaMA-3 70B align with theory: IGT decays without fresh information, TWR rises under deterministic decoding, and models operate well below $C_{int}$ due to repetition and forgetting. To probe failure modes, we design two diagnostics: E5 (independence sweep) shows that unrelated questions do not degrade IGT relative to a no-history baseline, and E6 (filler injection) quantifies the content–connective token tradeoff. We conclude by discussing implications for building more robust, information-efficient dialogue systems and alignment techniques to mitigate conversational drift.
Supplementary Material: pdf
Submission Number: 216
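
The page does not include a reference implementation, but a rough sketch can make the two metrics concrete. The snippet below approximates IGT as the total per-token surprisal (in bits) of a response conditioned on the conversation history under a reference causal LM, and TWR as the fraction of response tokens whose surprisal falls below a redundancy threshold. The model choice (`gpt2`), the 1-bit threshold, and these surprisal-based estimators are illustrative assumptions, not the authors' definitions.

```python
# Minimal sketch (not the paper's estimators): approximate IGT and TWR from
# per-token surprisal under a reference causal LM. Model name, threshold, and
# the surprisal-based definitions are assumptions made for illustration.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"       # assumption: any causal LM exposing logits would do
REDUNDANCY_BITS = 1.0     # assumption: tokens under 1 bit of surprisal count as redundant

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def token_surprisals_bits(history: str, response: str) -> list[float]:
    """Per-token surprisal (in bits) of `response` given `history`."""
    hist_ids = tokenizer(history, return_tensors="pt").input_ids
    resp_ids = tokenizer(response, return_tensors="pt").input_ids
    input_ids = torch.cat([hist_ids, resp_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits           # [1, seq_len, vocab]
    log_probs = torch.log_softmax(logits, dim=-1)
    offset = hist_ids.shape[1]
    surprisals = []
    for i in range(resp_ids.shape[1]):
        tok = resp_ids[0, i]
        # logits at position offset+i-1 predict the token at position offset+i
        lp = log_probs[0, offset + i - 1, tok].item()
        surprisals.append(-lp / math.log(2))       # nats -> bits
    return surprisals


def igt_and_twr(history: str, response: str) -> tuple[float, float]:
    """IGT: total surprisal (bits) of the response given the history.
    TWR: fraction of response tokens below the redundancy threshold."""
    s = token_surprisals_bits(history, response)
    igt = sum(s)
    twr = sum(1 for b in s if b < REDUNDANCY_BITS) / max(len(s), 1)
    return igt, twr


if __name__ == "__main__":
    hist = "User: What is the capital of France?\nAssistant:"
    resp = " The capital of France is Paris."
    igt, twr = igt_and_twr(hist, resp)
    print(f"IGT ~ {igt:.1f} bits, TWR ~ {twr:.2f}")
```

Under these assumptions, a response that merely restates the history scores low IGT and high TWR, while a response carrying genuinely new content does the opposite, matching the qualitative behavior the abstract describes.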