The Verbose Context Problem in Medical Records

Published: 23 May 2026, Last Modified: 04 Jun 2026, Venue: SD4H ICML 2026, License: CC BY 4.0
Keywords: population health, evaluations, long-context inference, electronic health records, prompt compression
TL;DR: PopMedQA (created using the neopatient library) evaluates how well LLMs cope with the context bloat caused by large numbers of medical codes.
Abstract: The verbose context problem occurs when structured concepts have token-inefficient textual representations. This bottleneck is acute in population health: cohort-level analysis of longitudinal patient records requires reasoning over thousands of medically-coded events, often exceeding 400K tokens in total. We present PopMedQA, a benchmark isolating this problem through computational tasks on groups of longitudinal patient records. We construct the benchmark using neopatient, a new library for language-controlled generation of artificial patient records. Through extensive ablations (including prompting strategies, prompt compression, and agentic decomposition), we find that domain-independent methods fail to alleviate the verbose context problem. There remains significant opportunity to exploit domain-specific structure in language model inputs for population-scale reasoning.
Submission Number: 28