Debiasing Medical Visual Question Answering via Counterfactual Training

Published: 01 Jan 2023, Last Modified: 05 Nov 2023, MICCAI (2) 2023
Abstract: Medical Visual Question Answering (Med-VQA) aims to predict a convincing answer given a medical image and a clinical question, in order to assist clinical decision-making. However, existing methods tend to rely on superficial linguistic correlations as a shortcut, which may produce unsatisfactory clinical answers. In this paper, we propose a novel DeBiasing Med-VQA model with CounterFactual training (DeBCF) to comprehensively overcome language priors. Specifically, we generate counterfactual samples by masking crucial keywords and assigning irrelevant labels, which implicitly increases the model's sensitivity to semantic words and visual objects, thereby weakening bias. Furthermore, to explicitly suppress spurious linguistic correlations, we formulate the language prior as a counterfactual causal effect and eliminate it from the total effect on the generated answers. Additionally, we present a newly split, bias-sensitive Med-VQA dataset, Semantically-Labeled Knowledge-Enhanced under Changing Priors (SLAKE-CP), constructed by regrouping and re-splitting the train and test sets of SLAKE so that the two splits have different answer prior distributions, forcing the model to learn interpretable objects rather than merely memorizing biases. Experimental results on two public datasets and SLAKE-CP demonstrate that the proposed DeBCF outperforms existing state-of-the-art Med-VQA models and achieves significant improvements in accuracy and interpretability. To the best of our knowledge, this is the first attempt to overcome language priors in Med-VQA and to construct a bias-sensitive dataset for evaluating debiasing ability.
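The two debiasing mechanisms described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, the token-level masking, and the simple score subtraction are illustrative assumptions standing in for the paper's counterfactual sample generation and its subtraction of the language-prior causal effect from the total effect.

```python
import random

def counterfactual_sample(question_tokens, keywords, answer_vocab,
                          true_answer, mask_token="[MASK]"):
    """Sketch of counterfactual sample generation: mask the crucial
    keywords in the question and attach an irrelevant (wrong) answer
    label, as the abstract describes."""
    masked = [mask_token if tok in keywords else tok for tok in question_tokens]
    # Assign a label drawn from the answer vocabulary, excluding the true answer.
    irrelevant = random.choice([a for a in answer_vocab if a != true_answer])
    return masked, irrelevant

def debiased_scores(total_effect, language_only_effect):
    """Sketch of the causal-effect subtraction: remove the language-prior
    (question-only) effect from the total effect on the answer scores."""
    return [te - le for te, le in zip(total_effect, language_only_effect)]
```

For example, masking the keyword "organ" in "what organ is shown" yields a question the image alone must resolve, while `debiased_scores` down-weights answers that score highly from the question text alone.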