Enhancing Out-of-Distribution Generalization in VQA through Gini Impurity-guided Adaptive Margin Loss

Published: 01 Jan 2024 · Last Modified: 13 May 2025 · ICME 2024 · CC BY-SA 4.0
Abstract: In the Visual Question Answering (VQA) task, most methods are influenced by language bias, resulting in poor performance on out-of-distribution data. Recently, some works have attempted to use adaptive margin losses to address this bias. However, these works typically consider only the frequency of answer labels when designing the margin loss, causing some samples to be overly emphasized or to receive insufficient attention during training. To address this issue, we propose a novel margin loss guided by Gini impurity for VQA debiasing. By jointly considering label distribution and instance complexity, we use Gini impurity to adjust the margin values in the margin loss, balancing the model's attention across different samples. Importantly, our method is plug-and-play and can be applied directly to any baseline. On the VQA-CP v2 benchmark, our method surpasses current state-of-the-art results across various baselines.
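The abstract only sketches the idea, but the core mechanism it describes — using Gini impurity to modulate per-sample margins in a margin loss — can be illustrated concretely. The sketch below is a minimal, hypothetical reading of that idea (the function names, the choice to compute impurity over the model's softmax distribution, and the linear margin scaling are all assumptions, not the paper's actual formulation): the Gini impurity `1 - sum(p_i^2)` of a sample's predicted distribution is near zero for confident predictions and large for ambiguous ones, and it is used here to scale the margin subtracted from the ground-truth logit.

```python
import numpy as np

def gini_impurity(probs):
    """Gini impurity of a probability distribution: 1 - sum(p_i^2).

    Zero for a one-hot (fully confident) distribution; maximal
    (1 - 1/C) for a uniform distribution over C classes.
    """
    return 1.0 - np.sum(probs ** 2, axis=-1)

def adaptive_margin_logits(logits, labels, base_margin=0.5):
    """Subtract a Gini-impurity-scaled margin from the ground-truth logit.

    A larger margin is applied to ambiguous (high-impurity) samples,
    demanding a bigger gap before the loss is satisfied; confident
    samples receive almost no extra margin. This is one plausible way
    to instantiate an impurity-guided adaptive margin, not the paper's
    exact loss.
    """
    # Numerically stable softmax over the class dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=-1, keepdims=True)

    # Per-sample margin scaled by the impurity of the prediction.
    margins = base_margin * gini_impurity(probs)

    # Apply the margin only to the ground-truth class logit.
    adjusted = logits.copy()
    adjusted[np.arange(len(labels)), labels] -= margins
    return adjusted
```

A standard cross-entropy loss would then be applied to the adjusted logits; because the margin shrinks as the model grows confident, easy samples stop dominating the gradient while ambiguous ones keep a nontrivial margin to close.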