MAR: Medical Asymmetric Retriever for Efficient Chinese Medical Dense Retrieval

ICLR 2026 Conference Submission20011 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Information Retrieval, Medical Text Embedding, Chinese Medical Domain, Asymmetric Architecture
TL;DR: We introduce MedTEB (a high-quality Chinese medical retrieval benchmark) and MAR (a lightweight asymmetric retriever), enabling accurate, low-latency Chinese medical retrieval.
Abstract: Embedding models are critical for domain-specific information retrieval (IR), particularly in healthcare, where accurate, low-latency access to medical knowledge can enhance clinical decision support and mitigate hallucinations in retrieval-augmented generation (RAG) systems. However, Chinese medical retrieval remains underdeveloped due to the absence of high-quality medical retrieval benchmarks. To address this limitation, we propose a high-quality Chinese **Med**ical **T**ext **E**mbedding **B**enchmark (**MedTEB**), which covers three practical tasks close to real-world scenarios: retrieval, reranking, and semantic textual similarity (STS). We introduce comprehensive LLM-based annotation into the construction process to improve the quality of the curated datasets. By evaluating powerful existing general-purpose embedding models on MedTEB, we demonstrate that it is a challenging domain-specific benchmark for assessing models' Chinese medical retrieval capabilities. On this foundation, we propose the **M**edical **A**symmetric **R**etriever (**MAR**), an asymmetric embedding architecture that decouples query and document encoding: a lightweight encoder handles online queries with minimal latency, while a powerful offline LLM-based encoder preserves retrieval quality. Optimizing this asymmetric architecture brings new challenges, so we introduce a two-stage optimization framework: 1) **query encoder alignment** and 2) **joint fine-tuning**. With this approach, MAR achieves state-of-the-art (SOTA) performance on MedTEB while maintaining inference speeds comparable to small BERT-style embedding models, yielding an excellent trade-off between accuracy and efficiency and thus offering a practical and effective solution for real-world Chinese medical retrieval scenarios. Our code, data, and models will be made publicly available to facilitate future research on domain-specific IR.
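The asymmetric retrieval flow described above can be sketched minimally: documents are embedded once, offline, by the (heavy) document encoder, while each incoming query is embedded by a cheap encoder into the same space and matched by inner product. The sketch below uses a toy hashed character-bigram "encoder" as a stand-in for both models; the real MAR query and document encoders, their dimensions, and any alignment machinery are not shown here and the corpus strings are purely illustrative.

```python
import zlib
import numpy as np

DIM = 512  # illustrative embedding width, not MAR's actual dimension

def embed(text: str, dim: int = DIM) -> np.ndarray:
    """Toy shared embedding: hashed bag of character bigrams, L2-normalized.
    Stands in for a real encoder; MAR would use a small BERT-style model
    for online queries and a large LLM-based model for offline documents."""
    v = np.zeros(dim)
    for i in range(len(text) - 1):
        v[zlib.crc32(text[i : i + 2].encode("utf-8")) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

# Offline stage: embed the whole corpus once with the document encoder
# and store the resulting index (here, a plain matrix).
corpus = ["高血压的治疗方法", "糖尿病饮食建议", "感冒的常见症状"]
doc_matrix = np.stack([embed(d) for d in corpus])

# Online stage: embed the query with the lightweight encoder and score
# by inner product (cosine similarity, since vectors are normalized).
query_vec = embed("高血压怎么治疗")
scores = doc_matrix @ query_vec
best = corpus[int(np.argmax(scores))]
print(best)
```

The key property the architecture exploits is that only `embed(query)` sits on the latency-critical path; the expensive document-side computation is amortized at index-build time.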
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 20011