Keywords: Self-interaction with external knowledge, Retriever-free RAG, Unified generator-retriever
TL;DR: FREESON is a retriever-free RAG framework in which LRMs act as both generator and retriever, using a retrieval-specialized MCTS to self-interact with external knowledge and overcome the representation bottleneck of embedding-based retrieval.
Abstract: Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in multi-step reasoning and in calling search engines at appropriate steps. However, existing retrieval-augmented reasoning approaches rely on separate retrieval models, limiting the LRM's role in retrieval to deciding when to retrieve and how to query. This separation not only increases hardware and operational costs but also leads to errors in the retrieval process due to the representation bottleneck, a phenomenon where the retriever's embedding space lacks sufficient expressiveness to capture the distinctions required by the generator. To address this, we shift our perspective on retrieval from sequence-to-sequence matching to locating answer-containing paths within the corpus, and propose a novel framework called FREESON (Retriever-FREE Retrieval-Augmented ReaSONing). This framework enables LRMs to directly interact with external knowledge by acting as both generator and retriever, thereby autonomously acquiring relevant information. To achieve this, we introduce a variant of the MCTS algorithm specialized for the retrieval task, which we call CT-MCTS (Corpus-Traversing Monte Carlo Tree Search). In this algorithm, LRMs traverse the corpus toward answer-containing regions. Experiments on five open-domain QA benchmarks covering both single-hop and multi-hop questions demonstrate that FREESON achieves an average improvement of 14.4% in EM and F1 over four multi-step reasoning models with a separate retriever. It also performs comparably to the strongest baseline overall, surpassing it by 3% on PopQA and 2WikiMultihopQA and by 12% on the fact-checking benchmark FEVER.
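The abstract describes CT-MCTS only at a conceptual level, so the following is a minimal, illustrative Python sketch of the general idea rather than the paper's implementation: tree nodes are corpus passages, expansion follows hypothetical passage-to-passage links (the `neighbors` map), and the placeholder `lrm_relevance` function stands in for the LRM's own relevance judgment, which in FREESON replaces a separate embedding-based retriever. All names here (`ct_mcts`, `lrm_relevance`, `neighbors`) are assumptions made for illustration, not APIs from the paper.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Optional

# Placeholder for the LRM's relevance judgment. In FREESON the reasoning model
# itself scores how promising a passage is; here a keyword-overlap heuristic
# stands in so the sketch runs without a model. (Assumption for illustration.)
def lrm_relevance(question: str, passage: str) -> float:
    q = set(question.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

@dataclass
class Node:
    passage_id: int
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

def uct(node: Node, c: float = 1.4) -> float:
    # Standard UCT score; unvisited children are explored first.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )

def ct_mcts(question, corpus, neighbors, root_id, iters=100):
    """Toy corpus-traversing MCTS: nodes are passages, edges are corpus links,
    and the step reward is the (placeholder) LRM relevance score."""
    root = Node(root_id)
    for _ in range(iters):
        # 1. Selection: descend by UCT until a node still has unexplored links.
        node = root
        while node.children:
            expanded = {c.passage_id for c in node.children}
            if any(n not in expanded for n in neighbors.get(node.passage_id, [])):
                break
            node = max(node.children, key=uct)
        # 2. Expansion: follow one unexplored link out of this passage.
        expanded = {c.passage_id for c in node.children}
        frontier = [n for n in neighbors.get(node.passage_id, []) if n not in expanded]
        if frontier:
            child = Node(random.choice(frontier), parent=node)
            node.children.append(child)
            node = child
        # 3. Evaluation: does this passage move toward an answer-containing region?
        reward = lrm_relevance(question, corpus[node.passage_id])
        # 4. Backpropagation.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited passage as the retrieved evidence.
    best = max(root.children, key=lambda c: c.visits, default=root)
    return corpus[best.passage_id]

if __name__ == "__main__":
    corpus = {
        0: "Paris is the capital of France.",
        1: "France is a country in Western Europe.",
        2: "The Eiffel Tower is a landmark in Paris.",
    }
    neighbors = {0: [1, 2], 1: [0], 2: [0]}  # hypothetical passage links
    print(ct_mcts("What is the capital of France?", corpus, neighbors, root_id=1))
```

The point of the sketch is the reframing the abstract argues for: retrieval becomes path-finding over the corpus guided by the generator's own judgments, so no separate embedding model, and hence no representation bottleneck, enters the loop.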
Primary Area: generative models
Submission Number: 19237