Keywords: AutoCompressor, Long Context, Information Theory
Abstract: The long-context bottleneck of transformer-based language models can be addressed via context compression frameworks such as AutoCompressors, which distill tokens into \textbf{soft prompts} but silently assume uniform information density. We revisit this assumption and introduce dynamic segmentation: the input is partitioned whenever the cumulative token-level \textbf{surprisal} exceeds a threshold $\tau$, yielding segments with balanced information content before \textbf{summary vector} generation. We show that dynamically adjusting segment boundaries based on surprisal enables better alignment between the original context and the resulting soft prompts for prediction and inference. Experimental results show that our surprisal-based segmentation outperforms both a pretrained baseline model and the randomized-segmentation AutoCompressor baseline in terms of cross-entropy loss and in-context learning (ICL) accuracy.
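A minimal sketch of the surprisal-thresholded segmentation described in the abstract, assuming a Hugging Face causal LM for scoring; the model name, the value of $\tau$, the hard length cap, and the helper name `surprisal_segments` are illustrative choices, not taken from the paper.

```python
# Illustrative sketch (not the authors' code): cut a new segment whenever the
# cumulative token-level surprisal exceeds a threshold tau.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def surprisal_segments(text, model, tokenizer, tau=256.0, max_len=512):
    """Split `text` into token-id segments, each holding roughly tau nats of surprisal."""
    ids = tokenizer(text, return_tensors="pt").input_ids          # (1, T)
    with torch.no_grad():
        logits = model(ids).logits                                 # (1, T, V)
    # Surprisal of token t given its prefix: -log p(x_t | x_<t)
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    surprisal = -log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)[0]  # (T-1,)

    segments, start, running = [], 0, 0.0
    for t, s in enumerate(surprisal.tolist(), start=1):
        running += s
        # Boundary when the information budget tau is exceeded (or a hard length cap is hit).
        if running > tau or (t - start) >= max_len:
            segments.append(ids[0, start:t])
            start, running = t, 0.0
    segments.append(ids[0, start:])                                # trailing segment
    return segments

if __name__ == "__main__":
    name = "gpt2"  # small model purely for illustration
    tok = AutoTokenizer.from_pretrained(name)
    lm = AutoModelForCausalLM.from_pretrained(name).eval()
    segs = surprisal_segments("Long documents rarely carry information at a uniform rate ...", lm, tok)
    print([len(s) for s in segs])
```

Each resulting segment would then be compressed into summary vectors by the AutoCompressor, so segments with dense information end up shorter and sparse stretches end up longer, rather than all segments sharing a fixed token count.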
Submission Number: 172