Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization

Published: 22 Mar 2025 · Last Modified: 12 Nov 2025 · CVPR 2025 · CC BY 4.0
Abstract: Multimodal Large Language Models (MLLMs) excel at a wide range of tasks, yet they often suffer from modality bias: the model relies heavily on a single modality and overlooks critical information in the others, leading to incorrect focus and irrelevant responses. In this paper, we propose using the paradigm of preference optimization to address the modality bias problem, contributing RLAIF-V-Bias, a debiased preference optimization dataset, and a Noise-Aware Preference Optimization (NaPO) algorithm. Specifically, we first construct the dataset by introducing perturbations that reduce the informational content of certain modalities, compelling the model to rely on a specific modality when generating negative responses. To address the noise that inevitably arises in automatically constructed data, we combine the noise-robust Mean Absolute Error (MAE) with the Binary Cross-Entropy (BCE) used in Direct Preference Optimization (DPO) via a negative Box-Cox transformation, and dynamically adjust the algorithm's noise robustness based on the noise levels estimated in the data. Extensive experiments validate our approach, demonstrating not only its effectiveness in mitigating modality bias but also its significant role in reducing hallucinations.
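To make the BCE/MAE combination concrete, below is a minimal PyTorch sketch of what a NaPO-style loss could look like. Only the core idea is taken from the abstract: applying the negative Box-Cox transformation (1 − p^q)/q to the DPO preference probability, which recovers BCE as q → 0 and the noise-robust MAE at q = 1. The function name, argument names, and the use of a single scalar q per batch are illustrative assumptions; the paper sets q dynamically from evaluated noise levels via a mechanism not specified here.

```python
import torch

def napo_loss(policy_chosen_logps, policy_rejected_logps,
              ref_chosen_logps, ref_rejected_logps,
              beta=0.1, q=0.5):
    """Sketch of a Noise-Aware Preference Optimization (NaPO) loss.

    Replaces DPO's -log sigmoid(margin) term (a BCE loss) with the
    negative Box-Cox transformation (1 - p^q) / q, which interpolates
    between BCE (as q -> 0) and the noise-robust MAE (1 - p, at q = 1).
    `q` is fixed here for illustration; the paper adjusts it from
    estimated noise levels in the data.
    """
    # DPO implicit reward margin between chosen and rejected responses
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    margin = beta * (pi_logratios - ref_logratios)

    # Probability that the policy prefers the chosen response
    p = torch.sigmoid(margin)

    # Negative Box-Cox transformation: -log(p) as q -> 0, (1 - p) at q = 1
    loss = (1.0 - p.pow(q)) / q
    return loss.mean()
```

Intuitively, with q near 0 the loss recovers standard DPO, while raising q toward 1 flattens the gradient on pairs the model confidently disagrees with, so mislabeled (noisy) preference pairs contribute less to the update.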