We introduce $UncertaintyRAG$, a novel method for long-context Retrieval-Augmented Generation (RAG) that uses Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate the similarity between text chunks. This span uncertainty improves the calibration of model predictions, which enhances robustness and mitigates the semantic inconsistencies introduced by random chunking. Building on this measure, we develop an efficient unsupervised learning technique for training the retrieval model and design an effective data sampling and scaling strategy. $UncertaintyRAG$ achieves a 2.03% improvement over baselines on LLaMA-2-7B, reaching state-of-the-art performance under distribution shift settings while using only 4% of the training data used by other powerful open-source retrieval models. The strong calibration provided by span uncertainty yields better generalization and robustness in long-context RAG tasks. Moreover, $UncertaintyRAG$ provides a lightweight retrieval model that can be integrated into any large language model, regardless of its context window length, without fine-tuning, highlighting the versatility of our approach.
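The abstract does not state how the SNR-based span uncertainty is computed. As a rough illustration only, the sketch below assumes one common definition of SNR (mean divided by standard deviation) applied to the per-token self-information (negative log-probability) of a span. The function name `span_snr`, the choice of self-information as the signal, and the input format are all assumptions made for this example, not the authors' formulation.

```python
import math
from statistics import mean, pstdev


def span_snr(token_logprobs, start, end):
    """Hypothetical helper: SNR of token self-information over a span.

    token_logprobs: per-token log-probabilities from some language model
                    (assumed to be supplied by the caller).
    start, end:     half-open span boundaries, as in token_logprobs[start:end].

    SNR is taken here as mean / population standard deviation, which is an
    assumption; the paper may define it differently.
    """
    # Self-information of each token in the span: -log p(token | context).
    info = [-lp for lp in token_logprobs[start:end]]
    if len(info) < 2:
        raise ValueError("span needs at least two tokens")

    sigma = pstdev(info)
    if sigma == 0.0:
        # A perfectly flat span has no "noise" under this definition.
        return math.inf
    return mean(info) / sigma
```

Under this assumed definition, a span whose tokens the model finds uniformly predictable or uniformly surprising scores high, while a span with erratic token-level confidence scores low.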
Keywords: RAG, Long-context, Distribution Shift, Signal-to-Noise Ratio, Unsupervised Learning
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 7689