Keywords: multilingual RAG, culturally grounded QA, agentic retrieval, retrieval-condition adaptation, cross-lingual retrieval, iterative RAG, multicultural retrieval, query rewrite RAG
Abstract: Multilingual retrieval-augmented generation (mRAG) is often implemented within a fixed retrieval space, typically via query or document translation or multilingual embeddings. However, this approach may be inadequate for culturally grounded queries, where retrieval-condition misalignment can occur. Even strong retrievers and generators may struggle to produce culturally relevant answers when evidence is sourced from inappropriate linguistic or regional contexts. To address this, we introduce **CORAL** (**CO**ntext-aware **R**etrieval with **A**gentic **L**oop), an adaptive retrieval methodology for mRAG that iteratively refines both the retrieval space (corpora) and the retrieval probe (query) based on the quality of the retrieved evidence. The process consists of (1) selecting corpora, (2) retrieving documents, (3) critiquing evidence for relevance and cultural alignment, and (4) checking sufficiency. If the retrieved documents are insufficient to answer the query correctly, the system (5) reselects corpora and rewrites the query. Across two cultural QA benchmarks, CORAL achieves up to a 3.58\%p accuracy improvement on low-resource languages relative to the strongest baselines.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: retrieval-augmented generation, multilingual / low resource, multilingualism, LLM/AI Agents
Contribution Types: NLP engineering experiment
Languages Studied: English, Korean, Indonesian, Amharic, Sundanese, Arabic, Hausa, Chinese, Assamese, Greek, Farsi, Spanish, Azerbaijani
Submission Number: 10538
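
The five-step loop described in the abstract can be sketched in a few lines of Python. This is only a minimal illustration of the control flow, not the authors' implementation: the function `agentic_retrieval_loop`, its parameters, the `max_rounds` cap, the toy corpora, and every helper below (`select_first`, `keyword_retrieve`, `keep_all`, `non_empty`, `widen`) are hypothetical stand-ins. In a real system each step would presumably be backed by a retriever or an LLM call.

```python
from typing import Callable, Dict, List, Tuple

Corpora = Dict[str, List[str]]


def agentic_retrieval_loop(
    query: str,
    corpora: Corpora,
    select: Callable[[str, List[str]], List[str]],
    retrieve: Callable[[str, Corpora], List[str]],
    critique: Callable[[str, List[str]], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    revise: Callable[[str, List[str], List[str]], Tuple[List[str], str]],
    max_rounds: int = 3,  # assumed cap; the abstract does not state one
) -> Tuple[List[str], int]:
    """Return (evidence, rounds_used) after iterating steps (1)-(5)."""
    names = list(corpora)
    chosen = select(query, names)                     # (1) select corpora
    evidence: List[str] = []
    for round_idx in range(1, max_rounds + 1):
        subset = {name: corpora[name] for name in chosen}
        docs = retrieve(query, subset)                # (2) retrieve documents
        evidence = critique(query, docs)              # (3) critique evidence
        if is_sufficient(query, evidence):            # (4) check sufficiency
            return evidence, round_idx
        chosen, query = revise(query, chosen, names)  # (5) reselect + rewrite
    return evidence, max_rounds


# --- Hypothetical toy components, for illustration only ---

def select_first(query: str, names: List[str]) -> List[str]:
    # Start with the first corpus only.
    return [names[0]]


def keyword_retrieve(query: str, subset: Corpora) -> List[str]:
    # Keep documents that share at least one whitespace token with the query.
    terms = set(query.lower().split())
    return [
        doc
        for docs in subset.values()
        for doc in docs
        if terms & set(doc.lower().split())
    ]


def keep_all(query: str, docs: List[str]) -> List[str]:
    # Placeholder critic: accepts every retrieved document.
    return docs


def non_empty(query: str, evidence: List[str]) -> bool:
    # Placeholder sufficiency check: any evidence counts as enough.
    return len(evidence) > 0


def widen(query: str, chosen: List[str], names: List[str]) -> Tuple[List[str], str]:
    # Add one not-yet-used corpus; this toy version leaves the query unchanged.
    remaining = [name for name in names if name not in chosen]
    return chosen + remaining[:1], query


# Invented toy data, not drawn from the paper's benchmarks.
toy_corpora = {
    "en": ["thanksgiving is celebrated in november"],
    "ko": ["chuseok is a korean harvest festival"],
}

evidence, rounds = agentic_retrieval_loop(
    "chuseok food", toy_corpora,
    select_first, keyword_retrieve, keep_all, non_empty, widen,
)
print(evidence, rounds)
```

In this toy run the first round searches only the `en` corpus and finds nothing, so the loop widens the retrieval space to include `ko` and succeeds on the second round. Passing each step as a callable keeps the loop itself independent of any particular retriever or critic.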