TL;DR: We propose a dataset to study confounding in continual learning and show that overcoming forgetting is insufficient to avoid confounding
Abstract: A dataset is confounded if it is most easily solved via a spurious correlation that fails to generalize to new data. In this work, we show that, in a continual learning setting where confounders may vary over time across tasks, the challenge of mitigating their effect far exceeds the standard forgetting problem normally considered. In particular, we provide a formal description of such continual confounders and identify that, in general, spurious correlations are easily ignored when training on all tasks jointly, but it is far harder to avoid confounding when tasks are trained sequentially. These descriptions serve as a basis for constructing a novel CLEVR-based continually confounded dataset, which we term the ConCon dataset. Our evaluations demonstrate that standard continual learning methods fail to ignore the dataset's confounders. Overall, our work highlights the challenges posed by confounding factors, particularly in continual learning settings, and demonstrates the need for continual learning methods that tackle them robustly.
Lay Summary: In machine learning, models can sometimes rely on shortcuts to make predictions. These shortcuts, called confounders, might work well on training data but often fail when the model sees new, different data. In our research, we explore a particularly difficult situation: when these misleading patterns change over time as the model learns from a sequence of tasks (a setting called continual learning).
We introduce the concept of continual confounders, i.e., spurious information that changes between tasks, and show that dealing with them is even more challenging than the well-known problem of forgetting old tasks in continual learning. While confounders are relatively easy to avoid when all tasks are learned at once, they become much trickier to handle when tasks come one at a time.
To study this, we built ConCon: a new dataset based on the CLEVR visual reasoning benchmark, designed specifically to test how well models can handle continually changing confounders. We found that existing continual learning methods struggle with this setup.
Our findings highlight an important gap in current machine learning techniques and suggest that new approaches are needed to help models learn robustly in the face of changing, misleading information.
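As a minimal illustration of the joint-versus-sequential contrast described above, the following synthetic sketch (not the ConCon dataset itself; all names and noise levels are illustrative assumptions) builds two tasks in which a true feature always predicts the label while a spurious feature flips sign between tasks, then compares sequential and joint training evaluated on unconfounded data:

```python
# Toy sketch of continual confounding, assuming numpy and scikit-learn.
# A true feature x0 predicts the label in every task; a spurious feature x1
# is an easy shortcut within each task but flips sign between tasks.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_task(n, spurious_sign):
    """Generate one task: x0 is the (noisy) true signal, x1 the confounder."""
    y = rng.integers(0, 2, n)
    x0 = y + 0.75 * rng.standard_normal(n)                   # true signal, noisy
    x1 = spurious_sign * y + 0.1 * rng.standard_normal(n)    # low-noise shortcut
    return np.stack([x0, x1], axis=1), y

tasks = [make_task(2000, s) for s in (+1, -1)]  # confounder flips across tasks
X_test, y_test = make_task(2000, 0)             # unconfounded test data (x1 is noise)

# Sequential (continual) training: one pass per task, in order.
seq = SGDClassifier(loss="log_loss", random_state=0)
for X, y in tasks:
    seq.partial_fit(X, y, classes=[0, 1])

# Joint training: all tasks at once, so the flipping confounder averages out.
X_all = np.concatenate([X for X, _ in tasks])
y_all = np.concatenate([y for _, y in tasks])
joint = SGDClassifier(loss="log_loss", random_state=0).fit(X_all, y_all)

print("sequential accuracy on unconfounded data:", seq.score(X_test, y_test))
print("joint accuracy on unconfounded data:", joint.score(X_test, y_test))
```

In this toy setup, the jointly trained model tends to rely on the true feature because the confounder is uninformative across the pooled data, whereas the sequentially trained model can latch onto each task's shortcut in turn; the size of the gap will vary with the noise levels and learner chosen here.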
Link To Code: https://github.com/ml-research/concon
Primary Area: Deep Learning
Keywords: Continual Learning, Confounding, Computer Vision, Shortcut Learning, Dataset
Submission Number: 9810