Abstract: Medical question answering (QA) is a reasoning-intensive task that remains challenging for large language models (LLMs) due to hallucinations and outdated domain knowledge. Retrieval-Augmented Generation (RAG) offers a promising post-training solution by leveraging external knowledge. However, existing medical RAG systems suffer from two key limitations: \textbf{(1)} a lack of modeling for human-like reasoning behaviors during information retrieval, and \textbf{(2)} reliance on suboptimal medical corpora, which often results in the retrieval of irrelevant or noisy snippets. To overcome these challenges, we propose \textit{Discuss-RAG}, a plug-and-play module designed to enhance medical QA RAG systems through collaborative agent-based reasoning. Our method introduces a summarizer agent that orchestrates a team of medical experts to emulate multi-turn brainstorming, thereby improving the relevance of retrieved content. In addition, a decision-making agent evaluates the retrieved snippets before their final integration. Experimental results on four benchmark medical QA datasets show that \textit{Discuss-RAG} consistently outperforms MedRAG, improving answer accuracy by up to 16.67\% on BioASQ and 12.20\% on PubMedQA. All code and prompt materials will be made publicly available.
Paper Type: Short
Research Area: Generation
Research Area Keywords: retrieval-augmented generation, biomedical QA
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 816
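
The abstract describes a two-stage agentic pipeline: multi-turn expert brainstorming to sharpen retrieval, followed by a decision-making agent that vets snippets before answer generation. The following is a minimal, illustrative Python sketch of that flow under our own assumptions; the `llm` and `retrieve` callables, all prompts, role names, and parameters (`rounds`, `k`) are hypothetical placeholders, not the authors' released implementation.

```python
# Hypothetical sketch of the Discuss-RAG flow described in the abstract.
# `llm` stands in for any chat-completion function and `retrieve` for any
# retriever over a medical corpus; prompts and roles are illustrative.
from typing import Callable, List, Sequence


def discuss_rag_answer(
    question: str,
    llm: Callable[[str], str],                  # prompt -> completion
    retrieve: Callable[[str, int], List[str]],  # query, k -> snippets
    experts: Sequence[str] = ("internist", "pharmacologist", "epidemiologist"),
    rounds: int = 2,
    k: int = 8,
) -> str:
    # 1) Summarizer agent orchestrates multi-turn expert brainstorming.
    discussion = ""
    for _ in range(rounds):
        for role in experts:
            discussion += llm(
                f"You are a {role}. Given the question:\n{question}\n"
                f"and the discussion so far:\n{discussion}\n"
                "Add one concise, medically relevant insight."
            ) + "\n"
    query = llm(
        f"Summarize this discussion into a focused search query:\n{discussion}"
    )

    # 2) Retrieve candidate snippets with the brainstormed query.
    snippets = retrieve(query, k)

    # 3) Decision-making agent keeps only snippets it judges relevant.
    kept = [
        s for s in snippets
        if llm(
            f"Question: {question}\nSnippet: {s}\nRelevant? Answer yes or no."
        ).strip().lower().startswith("yes")
    ]

    # 4) Final answer grounded in the vetted snippets.
    context = "\n".join(kept)
    return llm(
        f"Answer the medical question using this context.\n"
        f"Context:\n{context}\nQuestion: {question}"
    )
```

Keeping the LLM and retriever behind plain callables mirrors the abstract's plug-and-play claim: in principle such a module can wrap an existing RAG stack (e.g., MedRAG) without modifying its retriever or generator.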