GPT-RagAD: Two-layer Retrieval-Augmented Multilingual Diagnosis System
Keywords: Automated Diagnosis, Large Language Models, Retrieval-Augmented Generation, Knowledge Graph, Zero-shot Learning, Medical NLP, Heterogeneous Information Network
Track: Proceedings
Abstract: We introduce GPT-RagAD, a multilingual, zero-shot automated diagnosis system that achieves high accuracy without relying on real patient data. GPT-RagAD adopts a two-layer Retrieval-Augmented Generation (RAG) architecture: a knowledge-graph-based retriever selects disease candidates from 1,058 conditions, and an LLM-based re-ranker applies prompt-based reasoning to refine the predictions. Unlike traditional diagnostic models, which require supervised training on large clinical datasets, GPT-RagAD is privacy-preserving, scalable, and language-agnostic. Extensive evaluations on three datasets covering Chinese and English show that GPT-RagAD achieves 40.6% Hit@1 and 56.7% NDCG@10 on the Symptom2Disease benchmark, substantially outperforming embedding-based and direct LLM baselines. Ablation and sensitivity analyses further validate its robustness. GPT-RagAD presents a practical, lightweight solution for clinical triage and pre-diagnosis support.
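The two metrics reported in the abstract, Hit@1 and NDCG@10, can be illustrated with a minimal sketch. It assumes each case has exactly one ground-truth disease with binary relevance; the abstract does not state the exact evaluation protocol, so this is an assumption, and the function names and toy disease labels below are hypothetical.

```python
import math
from typing import Sequence


def hit_at_k(ranked: Sequence[str], truth: str, k: int) -> float:
    """Return 1.0 if the true disease appears in the top-k predictions, else 0.0."""
    return 1.0 if truth in ranked[:k] else 0.0


def ndcg_at_k(ranked: Sequence[str], truth: str, k: int) -> float:
    """NDCG@k under the assumption of a single relevant item.

    With one relevant item the ideal DCG is 1, so NDCG reduces to
    1 / log2(rank + 1), where rank is the 1-based position of the truth.
    """
    for index, disease in enumerate(ranked[:k]):
        if disease == truth:
            return 1.0 / math.log2(index + 2)
    return 0.0


def mean_metric(metric, rankings, truths, k: int) -> float:
    """Average a per-case metric over a set of cases."""
    scores = [metric(r, t, k) for r, t in zip(rankings, truths)]
    return sum(scores) / len(scores)


# Toy example (hypothetical labels): two cases, ranked candidate lists from a re-ranker.
rankings = [["influenza", "common cold", "pneumonia"],
            ["migraine", "tension headache", "sinusitis"]]
truths = ["common cold", "migraine"]

hit1 = mean_metric(hit_at_k, rankings, truths, 1)
ndcg10 = mean_metric(ndcg_at_k, rankings, truths, 10)
```

In this toy example the first case places the true disease at rank 2 and the second at rank 1, so Hit@1 averages to 0.5 while NDCG@10 still gives partial credit for the rank-2 hit.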
General Area: Models and Methods
Specific Subject Areas: Representation Learning
PDF: pdf
Supplementary Material: zip
Data And Code Availability: Yes
Ethics Board Approval: No
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Code URL: https://github.com/tracy3057/GPT-RagAD
Submission Number: 12