When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: LLMs for coding, LLMs for math, Human Study, HCI, AI-assisted ideation
TL;DR: We conduct a user study to evaluate how well language models help humans internalize their reasoning, revealing that strong model performance alone doesn't guarantee effective reasoning transfer.
Abstract: As large language models (LLMs) increasingly serve as close collaborators for humans, it is crucial that they express their reasoning in ways humans can understand and learn from. However, this capability remains poorly understood and under-evaluated. To address this, we introduce a conceptual framework for Human-AI knowledge transfer and conduct the first large-scale user study (N=118) explicitly designed to measure it. In our two-phase setup, humans first ideate with an LLM on problem-solving strategies and then independently implement solutions, isolating the influence of model reasoning on human understanding. Our findings reveal that while model benchmark performance correlates with collaborative outcomes, this relationship is notably inconsistent, with significant outliers, highlighting that knowledge transfer is a distinct capability requiring dedicated optimization. Our analysis uncovers behavioral and strategic factors that mediate successful knowledge transfer, and we release our code, dataset, and evaluation framework to support future work on communicatively aligned models.
Supplementary Material: zip
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 18113