Keywords: LLMs, Large Language Models, Positional Bias, Bandit Algorithms, Inference-Time Optimization
TL;DR: We propose a bandit-based algorithm that treats LLM positional bias as a signal, strategically reordering documents to find relevant information with up to 65% fewer model queries than random permutation baselines.
Abstract: Large language models exhibit a strong position bias in multi-document contexts, systematically prioritizing information based on location rather than relevance.
While existing approaches treat this bias as noise to be mitigated, we introduce GOLD PANNING BANDITS, a framework that leverages position bias as a diagnostic signal: by reordering documents and observing shifts in the model's responses, we can efficiently identify the most relevant content.
We frame the choice of reordering as a bipartite matching problem between documents and context positions.
While an optimal assignment can be computed at each iteration with the Hungarian algorithm in $O(N^3)$ time, we propose a greedy $O(N \log N)$ strategy that achieves comparable performance by prioritizing the placement of the most uncertain documents in the most informative positions.
Our approach identifies relevant documents using up to 65% fewer language model queries than random permutation baselines on knowledge-intensive NLP tasks, substantially reducing computational cost without model retraining.
This work demonstrates that inherent LLM biases can be transformed from liabilities into assets for efficient, inference-time optimization.
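A minimal sketch of the greedy $O(N \log N)$ assignment described in the abstract (not the authors' reference implementation): it assumes per-document uncertainty scores and per-position informativeness weights are already available, and simply pairs the most uncertain documents with the most informative positions.

```python
# Sketch only: `uncertainty` and `position_weight` are hypothetical inputs
# standing in for whatever scores the full method maintains.

def greedy_assignment(uncertainty, position_weight):
    """Place the most uncertain documents in the most informative positions.

    uncertainty[i]     -- assumed uncertainty score for document i
    position_weight[p] -- assumed informativeness of context position p
    Returns `order`, where order[p] is the document index placed at position p.
    Runs in O(N log N) due to the two sorts.
    """
    n = len(uncertainty)
    docs = sorted(range(n), key=lambda i: uncertainty[i], reverse=True)
    slots = sorted(range(n), key=lambda p: position_weight[p], reverse=True)
    order = [None] * n
    for doc, slot in zip(docs, slots):
        order[slot] = doc
    return order

# Example: document 2 is most uncertain and position 0 most informative,
# so document 2 is moved to the front of the reordered context.
print(greedy_assignment([0.1, 0.4, 0.9], [0.8, 0.5, 0.2]))  # -> [2, 1, 0]
```

The optimal assignment mentioned above could instead be computed from a full document-by-position score matrix with the Hungarian algorithm in $O(N^3)$; the greedy pairing avoids building that matrix.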
Primary Area: foundation or frontier models, including LLMs
Submission Number: 22258