Causal Interventional Training for Image Recognition

Wei Qin, Hanwang Zhang, Richang Hong, Ee-Peng Lim, Qianru Sun

Published: 01 Jan 2023, Last Modified: 16 May 2023IEEE Trans. Multim. 2023Readers: Everyone

Abstract: Deep learning models often fit undesired dataset bias in training. In this paper, we formulate the bias using <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">causal inference</i> , which helps us uncover the ever-elusive causalities among the key factors in training, and thus pursue the desired causal effect without the bias. We start from revisiting the process of building a visual recognition system, and then propose a structural causal model (SCM) for the key variables involved in dataset collection and recognition model: object, common sense, bias, context, and label prediction. Based on the SCM, one can observe that there are “good” and “bad” biases. Intuitively, in the image where a car is driving on a high way in a desert, the “good” bias denoting the common-sense context is the highway, and the “bad” bias accounting for the noisy context factor is the desert. We tackle this problem with a novel causal interventional training ( <monospace xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">CIT</monospace> ) approach, where we control the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">observed</i> context in each object class. We offer theoretical justifications for <monospace xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">CIT</monospace> and validate it with extensive classification experiments on CIFAR-10, CIFAR-100 and ImageNet, <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">e.g.</i> , surpassing the standard deep neural networks ResNet-34 and ResNet-50, respectively, by 0.95% and 0.70% accuracies on the ImageNet. Our code is open-sourced on the GitHub <uri xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/qinwei-hfut/CIT</uri> .

0 Replies