Keywords: Multimodal Embedding Retrieval; Bayesian Data Reweighting; Retrieval-Augmented Generation
Abstract: Knowledge-based Visual Question Answering (VQA) requires models to retrieve and incorporate external knowledge, e.g., documents, to answer questions. Existing retrievers are typically optimized with standard contrastive learning, which treats all non-positive pairs as equally informative, leading to false-negative bias and difficulties in hard-negative mining. To overcome these issues, we propose \textbf{Bayesian Data Reweighting (BDR)}, a probabilistic framework that assigns learnable importance weights to query-document pairs and performs Bayesian inference over these weights. We derive closed-form posterior updates under conjugate priors and develop an efficient EM algorithm for weight estimation. This approach adaptively emphasizes informative pairs without explicit hard-negative mining. Experiments on two representative multimodal retrievers demonstrate consistent improvements: BDR achieves gains of up to $8.6$ points on individual datasets and an average recall of $68.6$ across all M2KR datasets, surpassing the previous state-of-the-art.
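The abstract's idea of EM-estimated per-pair importance weights can be illustrated with a toy sketch. This is not the paper's derivation (BDR's conjugate priors and closed-form posteriors are specified in the paper, not here); it is a minimal stand-in that models per-pair contrastive losses as a two-component Gaussian mixture ("clean" vs. "noisy" pairs) and uses the posterior responsibility of the clean component as each pair's weight. All names and the mixture model itself are assumptions for illustration.

```python
import numpy as np

def em_pair_weights(losses, n_iters=20):
    """Toy EM over per-pair importance weights (illustrative only, not BDR's
    exact posterior updates). Losses are modeled as a two-component Gaussian
    mixture; the posterior probability of the low-loss ("clean") component
    serves as the pair's weight."""
    losses = np.asarray(losses, dtype=float)
    # Initialize component means at the extremes of the loss range.
    mu = np.array([losses.min(), losses.max()])
    sigma = np.full(2, losses.std() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iters):
        # E-step: responsibility of each component for each pair.
        dens = pi * np.exp(-0.5 * ((losses[:, None] - mu) / sigma) ** 2) / sigma
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: closed-form Gaussian-mixture parameter updates.
        nk = resp.sum(axis=0)
        mu = (resp * losses[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (losses[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        pi = nk / len(losses)
    # Weight = posterior of the clean (low-loss) component.
    return resp[:, 0]

# Low-loss pairs receive weights near 1; outlier high-loss pairs near 0.
weights = em_pair_weights([0.1, 0.2, 0.15, 2.5, 0.12, 3.0])
```

In a full training loop, such weights would multiply each pair's contrastive loss term, down-weighting likely false negatives without explicit hard-negative mining.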
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 2195