UncertaintyRAG: Span Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation

Zixuan Li; Jing Xiong; Fanghua Ye; Chuanyang Zheng; Xun Wu; Jianqiao Lu; Zhongwei Wan; Xiaodan Liang; Chengming Li; Zhenan Sun; Lingpeng Kong; Ngai Wong

26 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0

Keywords: RAG, Long-context, Distribution Shift, Signal-to-noise Ratio, Unsupervised Learning

Abstract: We introduce $UncertaintyRAG$, a novel method for long-context Retrieval-Augmented Generation (RAG) that uses span uncertainty based on the Signal-to-Noise Ratio (SNR) to estimate the similarity between text chunks. This span uncertainty improves the calibration of model predictions, which enhances robustness and mitigates the semantic inconsistencies introduced by random chunking. Building on this measure, we develop an efficient unsupervised learning technique for training the retrieval model and design an effective data sampling and scaling strategy. $UncertaintyRAG$ achieves a 2.03\% improvement over baselines on LLaMA-2-7B, reaching state-of-the-art performance while using only 4\% of the training data used by other powerful open-source retrieval models under distribution shift settings. Our method demonstrates strong calibration through span uncertainty, resulting in better generalization and robustness in long-context RAG tasks. Moreover, $UncertaintyRAG$ offers a lightweight retrieval model that can be integrated into any large language model, regardless of its context window length, without fine-tuning, highlighting the versatility of our approach.

Supplementary Material: zip

Primary Area: foundation or frontier models, including LLMs

Submission Number: 7689
