Abstract: Retrieval-augmented question answering (QA) incorporates external information and thereby improves the accuracy of reader models that lack domain knowledge. However, documents retrieved in closed domains demand high domain expertise, so the reader model may have difficulty fully comprehending the text. Moreover, the retrieved documents can contain thousands of tokens, some of which are unrelated to the question. As a result, the documents may include irrelevant or inaccurate information, which can lead the reader model to distrust the passages and produce hallucinations. To address these problems, we propose $\textbf{K-comp}$ ($\textbf{K}$nowledge-injected $\textbf{COMP}$ressor), which supplies the knowledge required to answer the question correctly. The compressor first automatically generates the prior knowledge needed to answer the question, and then compresses the retrieved passages autoregressively, injecting that knowledge into the compression process. This ensures alignment between the question intent and the compressed context. By augmenting the reader with this prior knowledge and a concise context, we guide reader models toward relevant answers and increase their trust in the context.
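The two-step pipeline described in the abstract (generate prior knowledge, then compress the retrieved passages conditioned on it) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all function names are hypothetical, and the toy string heuristics stand in for autoregressive language-model calls.

```python
# Hypothetical sketch of the K-comp pipeline. The real compressor is an
# autoregressive LM; here toy string operations stand in for model calls.

def generate_prior_knowledge(question: str) -> str:
    """Step 1 (hypothetical): generate the domain knowledge the reader
    would need before compressing any passages."""
    # Stand-in for an autoregressive generation call.
    return f"Prior knowledge for: {question}"

def compress_passages(question: str, knowledge: str, passages: list[str]) -> str:
    """Step 2 (hypothetical): compress passages conditioned on the question
    and the generated knowledge, keeping the summary aligned with the
    question intent."""
    # Toy heuristic: keep only sentences mentioning the first question term.
    keyword = question.split()[0].lower()
    kept = [s for p in passages for s in p.split(". ") if keyword in s.lower()]
    # The knowledge is prepended so the reader sees it before the context.
    return knowledge + " | " + ". ".join(kept)

def k_comp(question: str, passages: list[str]) -> str:
    """End-to-end: knowledge generation followed by knowledge-injected
    compression."""
    knowledge = generate_prior_knowledge(question)
    return compress_passages(question, knowledge, passages)
```

A usage example: `k_comp("metformin effects", [...])` would return the generated knowledge followed by only the passage sentences mentioning "metformin", dropping unrelated text.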
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: biomedical QA, knowledge base QA
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 359