Keywords: Fine-Tuning, RAG, Llama3, parametric knowledge, retrieval, LLM, hallucination
Abstract: Retrieval-augmented generation (RAG) has become a ubiquitous approach to improving response relevance in large language models (LLMs), especially as their pre-training data ages. However, due to the complexity of modern RAG systems and their interplay with LLM knowledge cutoffs, a number of open questions remain about obtaining optimal performance from these systems in practical settings. In this work, we take several steps toward addressing these questions. First, we quantify the impact of general knowledge cutoffs on RAG performance, finding that RAG remains an important factor even when parametric knowledge is updated. Second, we consider the relative utility of fine-tuning various RAG components to improve performance on private data. Coupling base-model fine-tuning with RAG produces strong results, while embedding-model tuning is less effective.
Submission Number: 114