Track: Scientific Track
Keywords: conversational QA, few-shot QA, generalization, RAG
TL;DR: An efficient approach for handling questions that RAG-based applications cannot readily answer
Abstract: Retrieval-augmented generation (RAG) is an established method for addressing challenges in applying large language models (LLMs), such as keeping answers up to date, incorporating domain-specific expertise, and minimizing hallucinations. However, applying data-augmented LLMs effectively remains difficult because of, for example, dependence on retriever performance, token limits on the input, and the inherent difficulty of global questions directed at large text corpora. Despite various efforts to address these challenges, some questions still cannot be answered correctly. Moreover, each module added to the RAG pipeline increases its complexity and latency, which can offset the practical value of the performance gains. Based on these observations, we propose an efficient, pragmatic approach to such not readily attainable questions: collect the questions that received incorrectly generated answers, prepare the correct answers offline, and prepend to the RAG system a module that performs semantic search over the prepared question-answer pairs. If a traditional RAG system is an open-book exam, this QA search module can be likened to an open-question exam, similar to a driver's license test, in which the pool of questions is known in advance.
Submission Number: 5
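
The approach outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `QASearchModule`, `answer`, and `rag_fallback`, the similarity threshold of 0.8, and the bag-of-words `embed` function are all assumptions made for illustration. A real system would presumably use a sentence-embedding model where `embed` appears here.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding (assumption): token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class QASearchModule:
    """Hypothetical search over question-answer pairs prepared offline."""

    def __init__(self, qa_pairs: List[Tuple[str, str]], threshold: float = 0.8):
        # The threshold value is an assumption, not taken from the paper.
        self.threshold = threshold
        self.entries = [(embed(question), ans) for question, ans in qa_pairs]

    def lookup(self, question: str) -> Optional[str]:
        """Return the prepared answer of the closest stored question, if close enough."""
        query = embed(question)
        best_score, best_answer = 0.0, None
        for vector, prepared_answer in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, prepared_answer
        return best_answer if best_score >= self.threshold else None


def answer(question: str, qa_module: QASearchModule,
           rag_fallback: Callable[[str], str]) -> str:
    """Consult the QA search module first; fall back to the RAG system on a miss."""
    prepared = qa_module.lookup(question)
    return prepared if prepared is not None else rag_fallback(question)
```

In this sketch the prepared pairs are checked before the RAG pipeline runs, so a question that matches a stored one is answered without invoking retrieval or generation; everything else falls through to `rag_fallback` unchanged.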