Beyond Chunking: Efficient Global Pooling for Holistic Long-Document Representation

17 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: Natural Language Processing, Retrieval-Augmented Generation, Text Classification, Long Document, Multimodal
Abstract: Effectively representing long documents is a persistent challenge in natural language processing because foundational encoders are constrained by limited context windows. Prevailing methods such as chunking produce fragmented representations that sever long-range dependencies and lose crucial global context, which hinders downstream task performance. To overcome this, we introduce **Spectral Attention Token Pooling (SATPool)**, a novel, encoder-agnostic module that generates a single, holistic vector for a document of any length. SATPool operates in two stages: it first uses an efficient linear attention mechanism to capture global token interactions across the entire document, then applies **Spectral Token Compression (STC)** to compress these globally aware token representations into a single compact vector. Through extensive experiments on diverse tasks, including long-document classification, Retrieval-Augmented Generation (RAG), multimodal RAG, and factual consistency evaluation, we demonstrate that SATPool consistently and significantly outperforms established baselines. Our work presents a practical, plug-and-play solution that unlocks the full potential of pre-trained encoders for long-form text without costly retraining, enabling more robust document-level understanding and retrieval.
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 9851