Abstract: Attribution in retrieval-augmented text generation is of great significance for knowledge-intensive tasks, as it enhances the credibility and verifiability of large language models (LLMs). However, existing research often ignores the adverse effect of “Middle Loss” in lengthy input contexts on answer correctness, as well as the potential negative impact of unverified citations on attribution quality. To address these challenges, we propose IVAKF (Iterative Verified Attribution with Keyword Fronting), a framework that better exploits long-context information and integrates attribution verification throughout the response-generation process. Specifically, for the “Middle Loss” issue, we employ a keyword-fronting strategy based on Named Entity Recognition (NER), guiding the model’s attention to key entities and their relationships with the rest of the context. For the issue of poor attribution quality, we design a verification-based iterative optimization algorithm that continuously updates candidate statements and citations until a satisfactory output is produced. Experiments on three public knowledge-intensive datasets demonstrate that the proposed framework significantly improves the quality of the final response, raising answer correctness by 6.4% and citation quality by 9.1% over the baselines.
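The two components named in the abstract, keyword fronting and verification-based iterative refinement, can be sketched as follows. This is a minimal illustration, not the authors' implementation: `extract_entities`, `generate`, and `verify` are hypothetical stand-ins (a real system would use an NER model, an LLM, and an entailment-style citation checker).

```python
def extract_entities(passage: str) -> list[str]:
    # Stand-in for an NER step: treat capitalized tokens as key entities.
    return [tok.strip(".,") for tok in passage.split() if tok[:1].isupper()]

def front_keywords(passage: str) -> str:
    # Keyword fronting: move detected entities to the start of the context,
    # mitigating the "Middle Loss" effect in long inputs.
    entities = extract_entities(passage)
    return "KEY ENTITIES: " + ", ".join(entities) + "\n" + passage

def generate(context: str) -> tuple[str, str]:
    # Stand-in generator: returns a (statement, citation) candidate.
    return ("answer drafted from context", "[1]")

def verify(statement: str, citation: str, context: str) -> bool:
    # Stand-in verifier: accept only citations that appear in the context.
    return citation in context

def ivakf(passage: str, max_iters: int = 3) -> tuple[str, str]:
    # Iterative loop: regenerate until the citation passes verification,
    # feeding verification feedback back into the context each round.
    context = front_keywords(passage)
    statement, citation = generate(context)
    for _ in range(max_iters):
        if verify(statement, citation, context):
            break
        context += "\nFEEDBACK: citation unverified, revise."
        statement, citation = generate(context)
    return statement, citation
```

The key design point illustrated here is that verification is inside the generation loop rather than a post-hoc filter, so unverified citations trigger another refinement round instead of being emitted.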