Keywords: State Space Models, Question-answering, Long-context Reading Comprehension
TL;DR: We introduce a distributed document processing framework that merges independently computed document hidden states from fine-tuned Mamba models, enabling efficient inference across corpora.
Abstract: We investigate whether hidden states from Structured State Space Models (SSMs) can be merged post hoc to support downstream reasoning. Inspired by model souping, we propose a strategy where documents are encoded independently and their representations are pooled, via simple operations like averaging, into a single context state. This approach, which we call document souping, enables modular encoding and reuse without reprocessing the full input for each query. We demonstrate that fine-tuned Mamba2 models with souped representations achieve performance competitive with or superior to the standard monolithic encoding approach across multi-hop QA, sparse retrieval, and long-document reasoning tasks. For example, on the RACE and QuALITY benchmarks for long-document question answering, our method substantially outperforms a traditional concatenation approach. Crucially, this modular design scales to hundreds of documents (we test up to 256) while delivering substantial savings in inference cost, unlocking new possibilities for large-scale corpus reasoning.
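To make the pooling operation concrete, below is a minimal sketch of the averaging step. The `encode_document` function is a hypothetical placeholder standing in for a fine-tuned Mamba2 encoder that returns a final SSM hidden state; this is an illustration of the idea under those assumptions, not the authors' implementation.

```python
import torch

# Hypothetical stand-in for a fine-tuned Mamba2 encoder: in the paper's setting
# this would return the model's final SSM hidden state after reading one
# document. Here it is simulated with a random vector for illustration only.
def encode_document(doc_tokens: torch.Tensor, d_state: int = 128) -> torch.Tensor:
    return torch.randn(d_state)

# "Document souping": encode each document independently (parallelizable and
# cacheable), then pool the per-document hidden states by simple averaging
# into a single context state for downstream reasoning.
def soup_documents(docs: list[torch.Tensor]) -> torch.Tensor:
    states = torch.stack([encode_document(d) for d in docs])  # (num_docs, d_state)
    return states.mean(dim=0)                                 # (d_state,)

# Usage: pool 256 documents without ever concatenating their tokens.
docs = [torch.randint(0, 50_000, (512,)) for _ in range(256)]
context_state = soup_documents(docs)
print(context_state.shape)  # torch.Size([128])
```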
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 19975